ATTACKS AND COUNTERMEASURES
IN  COMMUNICATIONS  AND  POWER  NETWORKS 
Jinsub  Kim,  Ph.D. 

Cornell  University  2014 

The threat of malicious network attacks has been significant ever since
networking became pervasive in our lives. When adversaries have enough control over
the network measurements and control procedures, the effect of an attack can be as
detrimental as the breakdown of the entire network operation. This dissertation
studies possible adversarial effects under a given protection strategy, the
conditions under which attacks can be detected, and protection strategies that render
attacks detectable. Specifically, attacks on two types of networks are considered:
communications networks and power networks.

First, we consider an attack on communications networks, where a pair of nodes
is suspected to belong to the chain of compromised nodes used by the adversary.
If the pair belongs to the compromised chain, it forwards attack packets along the
chain, and thus there should exist an information flow between the pair. Detection
of an information flow based on node transmission timings is formulated as a binary
composite hypothesis testing problem. An unsupervised and nonparametric detector with
linear complexity is proposed and tested with real-world TCP traces and MSN
VoIP traces. The detector is proved to be consistent for a class of nonhomogeneous
Poisson processes.

Secondly, the topology attack on power networks is studied. In a so-called
man-in-the-middle topology attack, an adversary alters data from certain meters and
network switches to mislead the control center with an incorrect network topology
while avoiding detection by the control center. A necessary and sufficient condition
for the existence of an undetectable attack is obtained, and countermeasures to
prevent undetectable attacks are presented. It is shown that any topology attack
is detectable if a set of meters satisfying a certain branch covering property is
protected from adversarial data modification. The proposed attacks are tested with
the IEEE 14-bus and IEEE 118-bus systems, and their effect on real-time locational
marginal pricing is examined.

Lastly, a new attack mechanism aimed at misleading the power system control
center about the source of data attacks is proposed. As a man-in-the-middle state
attack, a data framing attack is proposed to exploit the bad data detection and
identification mechanisms at the control center. In particular, the proposed attack
frames normal meters as sources of bad data and causes the control center to
remove useful measurements from the framed meters. The optimal design of the data
framing attack is formulated as a quadratically constrained quadratic program
(QCQP). It is shown that the proposed attack is capable of perturbing the power
system state estimate by an arbitrary degree using only half of the critical
measurements. Implications of this attack on power system operations are discussed,
and the attack performance is evaluated using benchmark systems.

CHAPTER 1

INTRODUCTION 

1.1  Motivation  and  Overview 

Since the advent of computer networks, networks among people and devices have
grown rapidly in size and capability. Nowadays, the majority of people
are connected to cellular or computer networks most of the time, and our reliance
on communications networks has never been greater. In addition to
communications networks, power networks play a crucial role in
supporting our daily life: they enable reliable delivery of electricity to
our homes, workplaces, and physical infrastructure.

For proper operation, a network has to be protected from possible attacks. As
the role of networks has grown in importance, the potential effects of network
attacks have also become significant. For instance, an adversary in a data network
may hack into a server to obtain data without authorization, possibly leaking
private data. In power networks, the cyber-physical nature of the system allows an
adversary to create even worse consequences. For instance, an adversary may alter
meter data to mislead the control center about the current operating condition.
Such an attack may lead to the breakdown of power plants, electricity price
perturbation, and even a blackout in the worst case. Attacks of this kind have been
reported repeatedly, which demonstrates that the threat is real.

Fortunately, many attacks leave traces in the network measurements (e.g., network
log files, measurements from deployed sensors). However, detecting the presence
of an attack from the measurements is nontrivial. Smart adversaries will
attempt to hide their traces, and this is indeed possible if they have enough control
over the network or the measurements. Furthermore, an attack may sometimes
not leave a strong signature in the network measurements. Therefore, a detection
algorithm needs to be carefully designed, and the fundamental limitation due to a
strong adversary needs to be studied. This dissertation studies protection
strategies to render attacks detectable and conditions under which a smart adversary
can launch an attack without leaving any detectable trace.

We first consider an attack on communications networks in Chapter 2. We
consider the so-called stepping stone attack [1], in which the attacker uses a chain
of compromised nodes to access the victim. This strategy is often used to confuse
the intrusion detection system about the adversary’s location. If the adversary
compromises a pair of supposedly independent nodes and uses them as stepping
stones to the victim, there should exist information flows (associated with the
attack) between the pair. Given suspect nodes for stepping stones, detection of
information flows can be applied to trace back the stepping stone chain. We
formulate the problem as a binary composite hypothesis testing problem and present an
unsupervised and nonparametric detection algorithm.

Then, we move to attacks on power networks in Chapter 3. In a power network,
the control center periodically collects measurements from meters and sensors
deployed throughout the network. These measurements are used in estimating the
real-time system state and the network topology. We study a specific type of data
attack that alters part of the measurements to mislead the control center with an
incorrect network topology. Such attacks may cause the control center to believe
false contingency information or to delay preventive actions when important
transmission lines are tripped. Unless the adversary can control a sufficient number of
measurements, an attempt to disturb the topology estimate causes inconsistency
among the measurements, and this anomaly can be detected by the legacy
bad data test. We provide a necessary and sufficient condition under which an
attack can be successful without causing any detectable anomaly and present the
construction of undetectable attacks. Then, the necessary and sufficient condition is
used to develop a graph-theoretical meter protection strategy.

In Chapter 4, a data framing attack on power system state estimation is
presented. The framing attack is a new approach to data attacks on state estimation
that misleads the control center into believing that certain normally operating
meters are responsible for generating biased measurements. The bad data
identification rule falsely identifies the data from these meters as bad and removes
them from system state estimation. Such an attack may degrade the accuracy of state
estimation and even make the network vulnerable to arbitrary perturbation of the
state estimate. We formulate the optimal design of the framing attack as a
quadratically constrained quadratic program and show that the framing attack needs
to alter only half of the critical set of measurements to perturb the state estimate
by an arbitrary degree. The proposed attack is evaluated with the IEEE 14-bus and
118-bus networks.

The following three sections give more details about related work and our
contributions on the three aforementioned topics.


1.2  Detection  of  Information  Flows 

Detection of information flows between a pair of nodes has been studied in the
context of network intrusion detection, especially in the detection of interactive
stepping-stone attacks [1]. The use of only transmission timing measurements for
detection is motivated by the fact that packets involved in an attack can be easily
encrypted. Even though transmission timings of nodes can be easily monitored,
detecting information flows based on timing is non-trivial. One main source of
difficulty is the presence of noise-like epochs. When an information flow exists
between the two nodes, the two nodes may also have transmissions that do not belong
to the flow. They may multiplex transmissions of other flows that go through only
one of the two nodes, or intentionally superpose dummy transmissions to avoid
detection. We refer to such transmissions as chaff transmissions.

1.2.1  Related  Works 

Donoho et al. [1] were among the first to consider the flow model with a uniform
delay bound. Following their model, many algorithms have been proposed to
detect a flow with a delay constraint. As an active detection scheme, Wang et
al. [2] proposed a watermark-based detector which embeds watermarks by slightly
adjusting transmission timings of a node; if the same watermarks are detected at
another node, the two nodes are declared to have a flow between them. Their work was
followed by a large number of watermark-based detectors [3-10]. The insertion
of watermarks, however, requires the ability of the detector to modify traffic at
different locations of the network, which may not be possible in practical situations.

If the network traffic cannot be modified to facilitate detection, the problem is
referred to as passive flow detection, and it is the problem of our interest. In
passive detection, a detector collects transmission timing measurements and analyzes
them to draw a conclusion about the presence of a flow. Many research efforts have
been made to develop effective passive detectors. Zhang et al. [11, 12] proposed
matching-based algorithms. However, they assumed that only one of the two nodes
can insert chaff transmissions, and their algorithms are vulnerable to chaff
insertion at both nodes. Donoho et al. [1] proposed a wavelet analysis with a claim that
it can detect a flow in chaff if the chaff part is independent of the flow part and
the sample size is sufficiently large. Blum et al. [13] presented a counting-based
method which was shown to be able to detect a flow in chaff if the fraction of
chaff is small enough. Under the Poisson traffic assumption, they characterized
the sufficient sample size for satisfying a given false alarm probability constraint.
However, their method may result in a high miss detection probability if chaff
transmissions are bursty. He and Tong [14] proposed a matching-based detector with
better chaff tolerance and characterized the maximum tolerable fraction of chaff
under the homogeneous Poisson traffic assumption. Their approach requires
choosing a detection threshold which is a function of the parameter of the underlying
Poisson traffic. When the traffic deviates from the Poisson model, the detection
algorithm is not always robust. The approach in [14] can be applied to general
traffic if training data with a sufficiently long time span are available. Coskun and
Memon [15, 16] presented detectors based on random projection of transmission
processes. Similar to [14], their methods also require choosing an appropriate
detection threshold, which can be successful only if a large volume of training data
or an accurate parametric model is available.
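
The matching idea that underlies several of the passive detectors above can be
illustrated with a minimal sketch. It assumes a flow model in which every relayed
packet leaves the second node within a delay bound delta of its transmission at the
first node; epochs that cannot be paired under this bound are candidates for chaff.
The function names, the greedy pairing rule, and the parameter values are
illustrative assumptions, not the algorithms of [11, 12, 14] and not the detector
proposed in this dissertation.

```python
import random


def poisson_epochs(rate, horizon, rng):
    """Return sorted arrival times of a homogeneous Poisson process on [0, horizon)."""
    epochs, t = [], rng.expovariate(rate)
    while t < horizon:
        epochs.append(t)
        t += rng.expovariate(rate)
    return epochs


def greedy_match(s1, s2, delta):
    """Pair epochs of s1 with later epochs of s2 that are at most delta away.

    Both inputs must be sorted, and each epoch is used at most once.
    Epochs left unpaired are the candidates for chaff.
    """
    pairs, j = [], 0
    for t in s1:
        # Epochs of s2 earlier than t cannot relay t or any later epoch of s1.
        while j < len(s2) and s2[j] < t:
            j += 1
        if j < len(s2) and s2[j] - t <= delta:
            pairs.append((t, s2[j]))
            j += 1
    return pairs


def matched_fraction(s1, s2, delta):
    """Fraction of all epochs (both nodes) that take part in a matched pair."""
    total = len(s1) + len(s2)
    return 2 * len(greedy_match(s1, s2, delta)) / total if total else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    delta = 0.5  # hypothetical delay bound
    n1 = poisson_epochs(rate=1.0, horizon=200.0, rng=rng)
    # Flow present: N2 relays every epoch of N1 within delta and adds its own chaff.
    relayed = [t + rng.uniform(0.0, delta) for t in n1]
    n2_flow = sorted(relayed + poisson_epochs(0.3, 200.0, rng))
    # No flow: N2 transmits independently of N1.
    n2_indep = poisson_epochs(1.3, 200.0, rng)
    print("flow present:", round(matched_fraction(n1, n2_flow, delta), 3))
    print("independent: ", round(matched_fraction(n1, n2_indep, delta), 3))
```

A detector built on such a statistic would compare the matched fraction with a
threshold, and choosing that threshold is exactly where the need for a traffic
model or training data, noted above, comes in.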

1.2.2  Contributions 

Our results include three parts: a nonparametric passive detection algorithm for
unidirectional or bidirectional flows, the related performance analysis, and
experiments with synthetic and real data. In developing an algorithm, our main
contribution is a new nonparametric technique that does not rely on knowledge of the
traffic distribution; nor does it require training data for either hypothesis. The key
idea lies in a particular transformation of the measurements that leads to distinct
statistical behaviors under the two hypotheses. The proposed detector does not
assume stationarity of traffic and hence is applicable in time-varying traffic
conditions. Furthermore, it is memory-efficient and has linear computational complexity
with respect to the sample size, thereby making real-time inference feasible.

In the algorithm analysis, we aim to give theoretical justification for the proposed
approach. To this end, we establish the consistency property of the proposed
detector for a class of non-homogeneous Poisson traffic. Even though the detector is
analyzed only for non-homogeneous Poisson traffic, the intuition behind it suggests
that it may perform well on traffic with more general distributions.

The performance of our detector is evaluated using synthetic Poisson traffic,
LBL TCP traces [17], and real-world measurements from MSN VoIP sessions, and
comparison with other passive detectors is provided. The use of synthetic data
allows us to examine the trade-offs between miss detection and false alarm
probabilities using Monte Carlo simulations. LBL TCP traces and MSN VoIP traces
are of course not guaranteed to satisfy the assumptions made in our algorithm
analysis, and our results indicate a level of robustness.


1.3  Topology  Attack  of  a  Power  Grid 

Liu, Ning, and Reiter [18] appear to be the first to introduce the concept of the
data injection attack (also referred to as the malicious data attack) on a power grid.
Assuming that the attacker is capable of altering data from a set of meters, a
scenario similar to the one assumed in our problem setting, the authors of [18] show
that if the set of compromised meters satisfies a certain condition, the adversary
can perturb the network state by an arbitrarily large amount without being detected
by any detector. In other words, the data attack considered in [18] is undetectable.
The main difference between [18] and our work is that the attacks considered in [18]
perturb only the network state, not the network topology. It is thus most appropriate
to refer to the attacks in [18] and its many follow-ups as state attacks, to
distinguish them from the topology attack considered in our work.
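
The undetectability of such a state attack can be illustrated under a linearized
(DC) measurement model. The notation below is assumed here for illustration and is
not taken from this chapter: z is the measurement vector, H the measurement matrix,
x the state, e the measurement noise with covariance R, and c an arbitrary state
perturbation chosen by the adversary.

```latex
z = Hx + e, \qquad
\hat{x}(z) = (H^{\top} R^{-1} H)^{-1} H^{\top} R^{-1} z, \qquad
r(z) = z - H\hat{x}(z).

% Attack vector of the form a = Hc:
\hat{x}(z + Hc) = \hat{x}(z) + c, \qquad
r(z + Hc) = z + Hc - H\bigl(\hat{x}(z) + c\bigr) = r(z).
```

Because z + Hc = H(x + c) + e, the attacked measurements are statistically
indistinguishable from genuine measurements of the state x + c, so the residue, and
hence any residue-based test, is unchanged. In this sketch the attack requires the
adversary to control every meter on which Hc is nonzero.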

1.3.1  Related  Works 

The work in [18] is influential; it has inspired many further developments, e.g.,
[19-22] and references therein, all focusing on state attacks. A key observation
is made by Kosut et al. in [23,24], showing that the condition of non-existence
of an undetectable attack is equivalent to that of network observability [25,26].
This observation leads to graph theoretic techniques that characterize network
vulnerability [24]. The condition to be presented in Chapter 3 on the non-existence
of an undetectable topology attack mirrors the state attack counterpart in [24].

The problem of adding protection on a set of meters to prevent undetectable
state attacks was considered by Bobba et al. [19]. We consider the same problem in
the context of topology attacks. While the meter protection problem for state attacks
is equivalent to protecting a sufficient number of meters to ensure observability
[19,24], the corresponding problem for topology attacks is somewhat different and
more challenging.

The problem of detecting topology errors from meter data is in fact a classical
problem, cast as part of the bad data detection problem [27-29]. Monticelli [30]
pioneered the so-called generalized state estimation approach where, once the state
estimate fails the bad data test, modifications of topology that best represent the
meter data are considered. Abur et al. [31] extended this idea to the least absolute
value state estimation formulation, and Mili et al. [32] applied the idea to
state estimation with the Huber M-estimator. Extensive work followed to improve
computational efficiency, estimation accuracy, and convergence properties over the
aforementioned methods (e.g., see [33-35] and references therein).

Finally, there has been only limited discussion of the impact of a malicious data
attack on power system operations. Should state estimates be used in closed-loop
control of the power grid, such an attack may cause serious stability problems. The
current state of the art, however, uses state estimates for real-time dispatch only
in a limited fashion. State estimates are nevertheless used extensively in calculating
the real-time locational marginal price (LMP) [36]. Thus, attacks that affect state
estimates will affect the real-time LMP calculation [37-39]. The way that a topology
attack affects LMP is significantly different from that of a state attack. We
demonstrate that a topology attack has significant impact on real-time LMP.

1.3.2  Contributions 

First,  we  characterize  conditions  under  which  undetectable  attacks  are  possible, 
given  a  set  of  vulnerable  meters  that  may  be  controlled  by  an  adversary.  To  this 
end,  we  consider  two  attack  regimes  based  on  the  information  set  available  to  the 
attacker.  The  more  information  the  attacker  has,  the  stronger  its  ability  to  launch 
a sophisticated attack that is hard to detect.

The global information regime is where the attacker can observe all meter and
network data before altering the adversary-controlled part of them. Although it
is unlikely in practice that an adversary is able to operate in such a regime, in
analyzing the impact of attacks, it is typical to consider the worst case by granting
the adversary additional power. We present a necessary and sufficient algebraic
condition under which, given a set of adversary-controlled meters, there exists an
undetectable attack that misleads the control center with an incorrect “target”
topology. This algebraic condition provides not only numerical ways to check whether
the grid is vulnerable to undetectable attacks but also insights into which meters to
protect to defend against topology attacks. We also provide specific constructions
of attacks and show that the proposed attacks are optimal in a certain sense.

A more practically significant situation is the local information regime, where
the attacker has only local information from the meters over which it has gained
control. We show that, under certain conditions, undetectable attacks exist and can
be implemented easily based on simple heuristics.

Secondly, we study conditions under which any topology attack can be made
detectable. Such a condition, even if it may not be the tightest, provides insights
into defense mechanisms against topology attacks. We show that if a set of meters
satisfying a certain branch covering property is protected, then topology attacks
can always be detected.


1.4  Framing  Attack  on  State  Estimation 

In power system state estimation, it is well known that bad data identification
rules may mistakenly identify good data entries as bad and remove them [40,41].
We study how such an inherent weakness of the bad data test can be exploited by
adversaries.

1.4.1  Related  Works 

We consider a man-in-the-middle attack, where an adversary can alter part of the
meter measurements so that the control center is misled by the partially corrupted
measurements.

In [18], Liu, Ning, and Reiter presented perhaps the first man-in-the-middle
(MiM) attack on power system state estimation, where an adversary replaces
“normal” sensor data with “malicious data.” It was shown that, if the adversary
could gain control of a sufficient number of meters, it could perturb the state
estimate by an arbitrary amount without being detected by the bad data detector
employed at the control center. Such undetectable attacks are referred to as covert
data attacks.

There  is  an  extensive  literature  on  covert  data  attacks,  following  the  work  of 
Liu,  Ning,  and  Reiter  [18].  While  the  data  framing  attack  mechanism  proposed 
here  is  fundamentally  different,  insights  gained  in  existing  work  are  particularly 
relevant.  Here,  we  highlight  some  of  these  ideas  in  the  literature. 

The explicit link between covert attacks on state estimation and system
observability was made in [19,23]. Consequently, classical observability conditions
[25,26,42] can be adapted to covert attacks and used to develop meter
protection strategies [19,24,43-46]. A particularly important concept is the
notion of a critical set of meters (or critical measurements) [26,47,48]. In assessing
the vulnerability of the grid, the minimum number of adversary meters necessary
for a covert attack was suggested as the security index for the grid [20,24].
Subsequently, meter protection strategies were proposed in [21, 22] to maximize the
security index under the protection resource constraint.

The framing attack strategy considered here relies on bad data identification
and removal techniques that have long been subjects of study [40,41,47,49,50]; see
[51,52] and references therein. Typically, residue vectors in normalized form
are used as statistics for the bad data test [40]. In particular, Mili et al. [50]
proposed a hypothesis testing method, in which the set of suspect measurements
is determined by the residue analysis in [40]. The use of non-quadratic cost
functions in state estimation was also studied to enhance the bad data identification
performance. In particular, the weighted least absolute value estimation [53-56] and
the least median of squares regression [57,58] were considered as alternatives with
comparably good performance. In this dissertation, we take the residue analysis
in [40] as a representative bad data test and analyze the effect of the framing
attack. However, the same analysis is applicable to general bad data tests.
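
For reference, a commonly used form of the normalized residue statistic can be
written as follows. This is a sketch under a linearized measurement model; the
symbols z (measurements), H (measurement matrix), R (noise covariance), and the
threshold tau are notation assumed here, and the exact test in [40] may differ in
detail.

```latex
\hat{x} = (H^{\top} R^{-1} H)^{-1} H^{\top} R^{-1} z, \qquad
r = z - H\hat{x},

\Omega = R - H (H^{\top} R^{-1} H)^{-1} H^{\top}, \qquad
r_i^{N} = \frac{|r_i|}{\sqrt{\Omega_{ii}}}.
```

Under this sketch, the measurement with the largest normalized residue is declared
bad when that residue exceeds tau, it is removed, and the state is re-estimated
from the remaining measurements. A framing attack aims to make the largest
normalized residues appear at meters that the adversary does not control.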

Detection of data attacks on state estimation, referred to as state attacks, has
also been studied in various frameworks. Kosut et al. [24] presented a generalized
likelihood ratio test for detection. Morrow et al. [59] proposed a detection
mechanism based on network parameter perturbation, which deliberately modifies the
line parameters and probes whether the measurements respond accordingly to the
modification. Distributed detection and estimation of adversarial perturbation was
also studied in [60]. In an effort to minimize the detection delay, attack
detection was also formulated as a quickest detection problem, and modified CUSUM
algorithms were proposed [61-63].

1.4.2 Contributions

We propose a data framing attack on power system state estimation. Specifically,
we formulate the design of the optimal data framing attack as a quadratically
constrained quadratic program (QCQP). To analyze the efficacy of the data framing
attack, we present a sufficient condition under which the framing attack can
achieve an arbitrary perturbation of the state estimate by controlling only half
of the critical set of meters. We demonstrate with the IEEE 14-bus and 118-bus
networks that the sufficient condition holds in critical sets associated with cuts.

The optimal design of the framing attack is based on a linearized system. In
practice, a nonlinear state estimator is often used. We demonstrate that, under the
nonlinear measurement model, framing attacks designed based on the linearized
system model successfully perturb the state estimate, and the adversary can
control the degree of perturbation as desired.


1.5  Organization 

In Chapter 2, we consider detection of an attack-associated information flow in
communications networks. Specifically, the problem is formulated as detection of
an information flow based on timing measurements. We start with the simpler
case of detection with a parametric flow model. Then, we present an unsupervised
and nonparametric flow detector. The detector is proved to be consistent for a
class of non-homogeneous Poisson traffic models. Lastly, the detector is tested and
compared with other benchmark techniques using real-world TCP and VoIP traces.

Starting from Chapter 3, we consider data attacks in power networks. A data attack
aimed at perturbing the topology estimate of the control center is studied. We first
study the attack for an adversary with global information. A necessary and
sufficient condition for an undetectable topology attack is presented, and the condition
is used to construct a simple graph-theoretical meter protection strategy. Then,
we consider an adversary with local information. An undetectable local attack is
presented and tested with the IEEE 14-bus and 118-bus networks.

In Chapter 4, we study a data framing attack on power system state estimation.
We present a new attack approach, which alters the adversary-controlled
measurements deliberately such that the bad data detection and identification rule falsely
removes measurements from normally operating meters while retaining adversarially
altered measurements. We first present the main idea of the attack and then
provide the optimization framework for the attack design. Controlling only half of
a critical set of meters, the proposed attack is shown to be able to perturb the state
estimate by an arbitrary degree. Numerical results with the IEEE benchmark
networks are provided.

Finally, Chapter 5 provides concluding remarks and comments on future work.

CHAPTER 2

DETECTION  OF  INFORMATION  FLOWS 

2.1  Introduction 

We consider the problem of detecting information flows through a pair of monitored
nodes as illustrated in Fig. 2.1. In particular, given the measurements of
transmission timings from the monitored nodes, we are interested in determining whether
the two monitored nodes are engaged in relaying packets of certain information
flows (the alternative hypothesis), or they are merely transmitting independently
(the null hypothesis). The network of our interest can be either wireless or wired
as long as transmission timings can be measured.

The generic problem of flow detection arises in a number of practical
applications, especially in the context of information forensics, network surveillance,
and anonymous networking. For example, in the so-called stepping-stone attack [1] in
a network, an adversary may attack a node by compromising a sequence of nodes
that serve as stepping stones. When the attacker is involved in an interactive
session (e.g., SSH), a flow of packets travels through a chain of stepping stones. By
detecting the presence of unexpected flows through monitored nodes, the network
owner can be alerted to the possibility of an attack. Other applications include the
detection of the wormhole attack [64], in which a set of colluding nodes diverts a
valid network flow through a “wormhole tunnel.” Understanding the problem of flow
detection is also valuable for the design and assessment of anonymous networks [65,66].

Figure 2.1: In the illustrated wireless network, the transmission timings of two nodes,
N1 and N2, are recorded. The horizontal axis is the time axis, and arrows represent
packet transmissions at different time points. As illustrated, packets of certain
information flows may travel through N1 and N2.

We restrict ourselves to timing measurements only. Such a restriction
is of course unnecessary in practice, because other information is often available, such as
source-destination addresses and packet statistics, and a practical detector should
incorporate such side information. We choose to focus exclusively on timing
information for two reasons. First, timing can be distorted but cannot be hidden by
the transmitter, and it can be measured by simple devices. In contrast,
source-destination addresses and packet characteristics can be masked using
standard techniques in anonymous networking [66]. Second, timing is a fundamental
traffic characteristic. It is therefore useful to understand the extent to which timing
reveals the presence of information flows. Furthermore, any side information, when
incorporated properly, will enhance the performance of techniques based solely on
timing information.

Even though the transmission timings of nodes can be easily monitored, detecting
information flows from timing measurements is non-trivial, partly because
traffic is non-stationary: transmission timings often have
time-varying intensities, and they may be bursty when interactive users are involved.
Moreover, it is generally difficult to obtain an accurate parametric model
for the monitored traffic, especially when there is no prior knowledge about the
nature of the traffic and no training data are available. The presence of noise-like
epochs is another source of difficulty. When an information flow travels through
two nodes, the two nodes may have transmissions that do not belong to the flow.
They may multiplex transmissions of other flows that pass through only one of the
two nodes, or intentionally superpose dummy transmissions to avoid detection. We
refer to the epochs of such transmissions as chaff epochs.

If a node can delay packets in a flow arbitrarily, timing
information alone is insufficient for detection. For latency-sensitive applications such
as VoIP and multimedia streaming, however, packets must satisfy end-to-end
delay constraints, which make the presence of such flows detectable. For
instance, VoIP applications require end-to-end delays to be bounded above by 150
msec [67]. We therefore assume that flow packets satisfy an
end-to-end delay constraint of $\Delta$ seconds.

2.1.1  Summary  of  Results  and  Organization 

Our results comprise three parts: a nonparametric flow detection algorithm for
unidirectional or bidirectional flows, the related performance analysis, and experiments
with synthetic and real data. In developing the algorithm, our main contribution is
a new nonparametric technique that relies neither on knowledge of the traffic
distribution nor on training data for either hypothesis. The key idea lies in
a particular transformation of the measurements that leads to distinct statistical
behaviors under the two hypotheses. The proposed detector does not assume
stationarity of traffic and hence is applicable under time-varying traffic conditions.
Furthermore, it is memory-efficient and has linear computational complexity with
respect to the sample size, thereby making real-time inference feasible.

In the algorithm analysis, we aim to give theoretical justification for the proposed
approach. To this end, we establish the consistency of the proposed
detector for a class of non-homogeneous Poisson traffic.

The performance of our detector is evaluated using synthetic Poisson traffic,
LBL TCP traces [17], and real-world measurements from MSN VoIP sessions, and it is
compared with other benchmark passive detectors. The LBL TCP traces
and MSN VoIP traces are of course not guaranteed to satisfy the assumptions
made in our algorithm analysis, so these results indicate a level of robustness.

The  rest  of  the  chapter  is  organized  as  follows.  Section  2.2  gives  the  notations 
and  definitions  employed  throughout  the  chapter  and  formulates  flow  detection 
as  a  binary  composite  hypothesis  testing  problem.  In  Section  2.3,  we  consider 
the  simpler  case  where  the  parametric  model  of  the  traffic  is  available.  Then, 
Section  2.4  presents  a  nonparametric  flow  detection  algorithm  and  its  consistency 
property.  In  Section  2.5,  the  proposed  detector  is  evaluated  using  synthetic  Poisson 
traffic,  LBL  TCP  traces,  and  MSN  VoIP  traffic.  The  proofs  of  theorems  are  given 
in  Section  2.6. 


2.2  Mathematical  Formulation 

This section introduces notation and definitions and formulates flow detection as
a binary composite hypothesis testing problem.

2.2.1  Notations  and  Flow  Models

Transmission timings of each node are modeled as a point process on $[0, \infty)$, and
detectors begin recording the timings at time 0. Bold upper-case letters (e.g., $\mathbf{S}$)
denote point processes, and bold lower-case letters (e.g., $\mathbf{s}$) denote their
realizations. $\mathbf{S}(i)$ represents the $i$th epoch (i.e., the time of the $i$th
transmission) of $\mathbf{S}$, and $\mathbf{s}(i)$ is its realization. The upper-case script
letter $\mathcal{S}$ denotes the set of epochs in the realization $\mathbf{s}$:
$\mathcal{S} \triangleq \{\mathbf{s}(i),\ i \geq 1\}$. In addition, we define a superposition
operator $\oplus$: given two increasing sequences $(a_i)_{i=1}^{\infty}$ and
$(b_i)_{i=1}^{\infty}$, $(a_i)_{i=1}^{\infty} \oplus (b_i)_{i=1}^{\infty} = (c_i)_{i=1}^{\infty}$,
where $c_i$ is the $i$th element of the sequence of all the elements of $(a_i)_{i=1}^{\infty}$
and $(b_i)_{i=1}^{\infty}$ arranged in increasing order.
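As an informal illustration of the superposition operator on finite realizations, the following minimal Python sketch merges two increasing epoch lists into one. The function name `superpose` and the sample epochs are hypothetical and are used only for illustration.

```python
import heapq
from typing import Iterable, List


def superpose(a: Iterable[float], b: Iterable[float]) -> List[float]:
    """Merge two increasing epoch sequences into one increasing sequence."""
    # heapq.merge assumes both inputs are already sorted in increasing order.
    return list(heapq.merge(a, b))


# Hypothetical example: a flow part f1 and a chaff part w1 observed at one node.
f1 = [0.2, 1.5, 3.0]
w1 = [0.9, 2.1]
s1 = superpose(f1, w1)  # all epochs transmitted by the node, in time order
```

In this reading, `s1` plays the role of a node's measured realization, which is the superposition of its flow and chaff epochs.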

First, we define a unidirectional flow as follows.

Definition 2.2.1 An ordered pair of point processes $(\mathbf{F}_1, \mathbf{F}_2)$ forms a
unidirectional flow if, for any realization $(\mathbf{f}_1, \mathbf{f}_2)$, there exists a
bijection $g : \mathcal{F}_1 \to \mathcal{F}_2$ satisfying $g(s) - s \in [0, \Delta]$ for all
$s \in \mathcal{F}_1$.

As illustrated in Fig. 2.2, when packets of an information flow travel through
node $N_1$ and node $N_2$, $\mathbf{F}_1$ and $\mathbf{F}_2$ can be interpreted as the
transmission timings of the flow packets at $N_1$ and $N_2$ respectively. The bijection
condition on $g$ means packet conservation: every flow packet sent by $N_1$ is received
and forwarded by $N_2$. The condition $g(s) - s \in [0, \Delta]$ means that every flow
packet transmission satisfies causality and the delay constraint $\Delta$. Based on the
above definition, we define a bidirectional flow as a superposition of two unidirectional
flows with opposite directions.
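For finite realizations, the condition of Definition 2.2.1 can be checked with a short sketch such as the one below. It pairs the $i$th epoch of one sequence with the $i$th epoch of the other; this order-preserving pairing suffices because, whenever some valid bijection exists, swapping any two crossing pairs keeps both delays within $[0, \Delta]$. The function name `is_unidirectional_flow` is hypothetical, and the truncation to finite lists is an assumption made for illustration.

```python
from typing import Sequence


def is_unidirectional_flow(f1: Sequence[float], f2: Sequence[float],
                           delta: float) -> bool:
    """Check Definition 2.2.1 on finite, increasing realizations f1 and f2."""
    # Packet conservation: a bijection needs equally many epochs on both sides.
    if len(f1) != len(f2):
        return False
    # Pair the i-th epoch of f1 with the i-th epoch of f2 and test
    # causality (delay >= 0) and the delay bound (delay <= delta).
    return all(0.0 <= y - x <= delta for x, y in zip(f1, f2))
```

For example, with `delta = 0.5` the pair `([0.0, 1.0], [0.1, 1.8])` is rejected because the second packet would need a delay of 0.8 seconds.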

Definition 2.2.2 A pair of point processes $(\mathbf{F}_1, \mathbf{F}_2)$ forms a
bidirectional flow if, for $i = 1, 2$, $\mathbf{F}_i$ can be decomposed into
$\mathbf{F}_i^{12}$ and $\mathbf{F}_i^{21}$ (i.e., $\mathbf{F}_i = \mathbf{F}_i^{12} \oplus \mathbf{F}_i^{21}$)
such that $(\mathbf{F}_1^{12}, \mathbf{F}_2^{12})$ and $(\mathbf{F}_2^{21}, \mathbf{F}_1^{21})$
are unidirectional flows.

We allow $(\mathbf{F}_1^{12}, \mathbf{F}_2^{12})$ and $(\mathbf{F}_2^{21}, \mathbf{F}_1^{21})$
to have zero rate, so that a unidirectional flow is a special case of a bidirectional flow.

Figure 2.2: Every packet transmission of a unidirectional flow is assumed to satisfy
packet conservation, causality, and the delay constraint $\Delta$.


2.2.2  Problem  Statement 


We formulate the detection of a bidirectional flow as a binary composite hypothesis
testing problem. Let $\mathbf{S}_1$ and $\mathbf{S}_2$ denote the transmission processes of
$N_1$ and $N_2$, respectively. Given the measurements $(\mathbf{s}_i)_{i=1}^{2}$ in $[0, t]$,
we test the following hypotheses:

$\mathcal{H}_0$ : $\mathbf{S}_1$ and $\mathbf{S}_2$ are independent;
$\mathcal{H}_1$ : $\mathbf{S}_i = \mathbf{F}_i \oplus \mathbf{W}_i$, $i = 1, 2$, and $(\mathbf{F}_1, \mathbf{F}_2)$ forms a bidirectional flow.   (2.1)

We further assume that, under $\mathcal{H}_1$,

1. $\mathbf{F}_1$ and $\mathbf{F}_2$ are point processes with non-zero rates^1.

2. $\mathbf{F}_1$ and $\mathbf{F}_2$ are not independent.

3. $(\mathbf{F}_1, \mathbf{F}_2)$, $\mathbf{W}_1$, and $\mathbf{W}_2$ are independent.

^1 In other words, if $N_{\mathbf{F}_i}(t)$ denotes the number of epochs of $\mathbf{F}_i$ in
$[0, t]$, there exists $\delta > 0$ such that
$\liminf_{t \to \infty} N_{\mathbf{F}_i}(t)/t > \delta$ almost surely, $i = 1, 2$.

$\mathcal{H}_0$ corresponds to the scenario in which $N_1$ and $N_2$ transmit independently.
$\mathcal{H}_1$ corresponds to the scenario in which $N_1$ and $N_2$ relay packets of
information flows in either or both directions; $(\mathbf{F}_i)_{i=1}^{2}$ and
$(\mathbf{W}_i)_{i=1}^{2}$ represent the flow part and the chaff part, respectively. Note
that under both hypotheses, no restriction is imposed on the marginal distributions of
$\mathbf{S}_i$, $\mathbf{F}_i$, and $\mathbf{W}_i$.

The assumptions under $\mathcal{H}_1$ are imposed to make the two hypotheses disjoint. The
first assumption implies that the bidirectional flow should have positive rate. The
second assumption means that the flow parts of $N_1$ and $N_2$ should not be independent,
and this assumption is expected to hold in general due to the delay constraint
$\Delta$. The third assumption implies that the chaff parts of $N_1$ and $N_2$ are
independent, and they are also independent of the flow part. We note here that the third
assumption is more restrictive than that used in earlier works [13,14].

We  employ  the  notion  of  Chernoff  consistency  [68]  to  evaluate  the  asymptotic 
performance  of  detectors. 

Definition 2.2.3 For $j = 0, 1$, $\mathcal{P}_j$ denotes the set of all possible
distributions of $(\mathbf{S}_i)_{i=1}^{2}$ under $\mathcal{H}_j$. A detector
$\delta((\mathbf{s}_i)_{i=1}^{2}, t)$ is a function of the epochs of $(\mathbf{s}_i)_{i=1}^{2}$
in $[0, t]$, which is equal to $j$ if the decision is $\mathcal{H}_j$.
$\delta((\mathbf{s}_i)_{i=1}^{2}, t)$ is said to be consistent if

1. $\forall\, Q_0 \in \mathcal{P}_0$, $\lim_{t \to \infty} Q_0\big(\delta((\mathbf{S}_i)_{i=1}^{2}, t) = 1\big) = 0$, and

2. $\forall\, Q_1 \in \mathcal{P}_1$, $\lim_{t \to \infty} Q_1\big(\delta((\mathbf{S}_i)_{i=1}^{2}, t) = 0\big) = 0$.

In other words, a detector is consistent if its false alarm and miss detection
probabilities vanish as $t$ grows, under all possible distributions in $\mathcal{P}_0$ and
$\mathcal{P}_1$. In the following sections, we will reduce $\mathcal{P}_0$ and
$\mathcal{P}_1$ to the sets of distributions satisfying certain additional conditions, and
prove the consistency of our detection algorithms.

Intuitively,  the  greater  the  amount  of  chaff  epochs,  the  harder  the  flow  detection 
becomes.  To  measure  the  relative  strength  of  the  flow  part  with  respect  to  the  chaff 
part,  we  introduce  the  following  definition  of  flow  fraction. 

Definition 2.2.4 Under $\mathcal{H}_1$, suppose that $(\mathbf{S}_i)_{i=1}^{2}$ consists of
the bidirectional flow $(\mathbf{F}_i)_{i=1}^{2}$ and the chaff part $(\mathbf{W}_i)_{i=1}^{2}$.
Given a realization $(\mathbf{s}_i)_{i=1}^{2}$, where $\mathbf{s}_i = \mathbf{f}_i \oplus \mathbf{w}_i$,
$i = 1, 2$, the flow fraction of $(\mathbf{s}_i)_{i=1}^{2}$ is defined as

$$R(t) = \frac{\sum_{i=1}^{2} |\mathcal{F}_i \cap [0, t]|}{\sum_{i=1}^{2} |\mathcal{S}_i \cap [0, t]|}, \qquad R = \liminf_{t \to \infty} R(t) \qquad (2.2)$$

where $|\mathcal{F}_i \cap [0, t]|$ is the number of flow packet transmissions at node $N_i$
in $[0, t]$, and $|\mathcal{S}_i \cap [0, t]|$ is the number of total transmissions at node
$N_i$ in $[0, t]$.

In other words, $R(t)$ is the fraction of the flow epochs in the measurements up
to time $t$, and $R$ is its limiting value.
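When the flow and chaff labels of a realization are known (as in a synthetic experiment), $R(t)$ in (2.2) is a simple ratio of counts. The sketch below is one possible way to compute it for sorted, non-negative epoch lists. The function name `flow_fraction` and the convention of returning 0 when no epoch has been observed are assumptions made for illustration.

```python
from bisect import bisect_right
from typing import Sequence


def flow_fraction(f1: Sequence[float], f2: Sequence[float],
                  s1: Sequence[float], s2: Sequence[float], t: float) -> float:
    """Fraction of flow epochs among all epochs observed in [0, t]."""
    # For a sorted list of non-negative epochs, bisect_right(seq, t)
    # is the number of epochs that fall in [0, t].
    flow = bisect_right(f1, t) + bisect_right(f2, t)
    total = bisect_right(s1, t) + bisect_right(s2, t)
    # Assumed convention: the fraction is 0 when nothing has been observed yet.
    return flow / total if total else 0.0
```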


2.3  Parametric  Flow  Detection 

We begin with the easier case in which an accurate parametric model of the traffic is
available. The main result of this section is a simple algorithm that computes, for
measurements $(\mathbf{s}_i)_{i=1}^{2}$, the maximum schedulable flow fraction $\hat{R}$,
which serves as our decision statistic. The flow detection algorithm is a threshold
decision rule based on $\hat{R}$. Computing the threshold, however, requires knowledge of
the traffic distribution under $\mathcal{H}_0$, which we assume to be available for the
moment; this assumption is removed in Section 2.4.


2.3.1  Decision  Statistic:  Maximum  Schedulable  Flow  Fraction

Under both hypotheses, given a realization $(\mathbf{s}_i)_{i=1}^{2}$ in $[0, t]$, its
maximum schedulable flow fraction $\hat{R}(t)$ is defined as

$$\hat{R}(t) = \max_{\{(\mathbf{f}_i, \mathbf{w}_i)_{i=1}^{2} \,:\, \mathbf{s}_i = \mathbf{f}_i \oplus \mathbf{w}_i \sim \mathcal{H}_1\}} \frac{\sum_{i=1}^{2} |\mathcal{F}_i \cap [0, t]|}{\sum_{i=1}^{2} |\mathcal{S}_i \cap [0, t]|}$$

where $\mathbf{s}_i = \mathbf{f}_i \oplus \mathbf{w}_i \sim \mathcal{H}_1$ denotes the
constraint that $\mathbf{s}_i = \mathbf{f}_i \oplus \mathbf{w}_i$, $i = 1, 2$, and
$(\mathbf{f}_1, \mathbf{f}_2)$ is a realization^2 of a bidirectional flow. In other words,
we schedule a maximum number of bidirectional flow transmissions between $\mathbf{s}_1$ and
$\mathbf{s}_2$ in $[0, t]$, and denote the fraction of the flow part by $\hat{R}(t)$.

^2 In other words, $\mathbf{f}_i$ can be partitioned into two subsequences
$\mathbf{f}_i^{12}$ and $\mathbf{f}_i^{21}$ such that there exist bijections
$g_1 : \mathcal{F}_1^{12} \to \mathcal{F}_2^{12}$ and $g_2 : \mathcal{F}_2^{21} \to \mathcal{F}_1^{21}$
satisfying $g_1(s) - s \in [0, \Delta]$, $\forall s \in \mathcal{F}_1^{12}$, and
$g_2(s) - s \in [0, \Delta]$, $\forall s \in \mathcal{F}_2^{21}$.

To evaluate $\hat{R}(t)$ efficiently, we propose a matching algorithm called
Bidirectional-Bounded-Greedy-Match (BiBGM), whose pseudocode is given in Table 2.1.
BiBGM starts with the first epoch in $\mathcal{S}_1 \cup \mathcal{S}_2$ and subsequently
finds the earliest one-to-one matches satisfying causality and the delay constraint.

Table 2.1: Bidirectional-Bounded-Greedy-Match

    BiBGM(s_1, s_2, Δ):
        m = n = 1;
        while m ≤ |S_1| and n ≤ |S_2|
            if s_2(n) < s_1(m) − Δ
                s_2(n) is chaff; n ← n + 1;
            else if s_2(n) > s_1(m) + Δ
                s_1(m) is chaff; m ← m + 1;
            else
                match s_1(m) with s_2(n); m ← m + 1; n ← n + 1;
            end
        end
        R̂ = |{Matched epochs}| / (|S_1| + |S_2|)

We explain below the operation of BiBGM using the example in Fig. 2.3, accompanied by
the pseudocode implementation in Table 2.1:


1. At the beginning, all the epochs in $\mathcal{S}_1 \cup \mathcal{S}_2$ are unmatched.
Start with the earliest epoch in $\mathcal{S}_1 \cup \mathcal{S}_2$, and go to MATCH to
find its match.

2. MATCH: Let $t$ denote the epoch for which we want to find a match. For
$i = 1, 2$, if $t \in \mathcal{S}_i$, search for the earliest unmatched epoch in
$[t, t + \Delta] \cap \mathcal{S}_{3-i}$ and match it with $t$; if there is no unmatched
epoch in the interval, label $t$ as chaff (an epoch is said to be checked if it is either
matched with another epoch or labeled as chaff). Go to MOVE.

3. MOVE: If every epoch in $\mathcal{S}_1 \cup \mathcal{S}_2$ is checked, terminate.
Otherwise, move to the next unchecked epoch in $\mathcal{S}_1 \cup \mathcal{S}_2$ and go to
MATCH to find its match.


For the example in Fig. 2.3, BiBGM starts with $t_1$. Since $t_1 \in \mathcal{S}_1$, we
search for the earliest unmatched epoch in $[t_1, t_1 + \Delta] \cap \mathcal{S}_2$, which
is $t_2$. Hence, $t_1$ is matched with $t_2$. Then, we move to the next unchecked epoch,
$t_3$ of $\mathcal{S}_1$. Because $t_2$ is the only epoch in
$[t_3, t_3 + \Delta] \cap \mathcal{S}_2$ and it is already matched with $t_1$, we label
$t_3$ as chaff. Next, we move to the next unchecked epoch ($t_4$ of $\mathcal{S}_2$) and
search for the earliest unmatched epoch in $[t_4, t_4 + \Delta] \cap \mathcal{S}_1$. BiBGM
continues until the last epoch of $\mathcal{S}_1 \cup \mathcal{S}_2$ is checked.

Figure 2.3: Bidirectional-Bounded-Greedy-Match

From Table 2.1, it can be easily seen that BiBGM has linear computational
complexity with respect to the sample size (i.e., the total number of observed
epochs). The following theorem states that BiBGM indeed achieves the optimal
scheduling, in which the flow part is maximized.

Theorem 2.3.1 Suppose we run BiBGM on $(\mathbf{s}_i)_{i=1}^{2}$ in $[0, t]$. Then, the
fraction of the matched epochs is equal to $\hat{R}(t)$.

Proof: See Section 2.6. ■
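The two-pointer loop of Table 2.1 translates almost directly into code. The following Python sketch is one possible rendering, assuming both inputs are finite, increasing lists of epochs already restricted to $[0, t]$. The function name `bibgm`, the returned list of matched pairs, and the convention of returning 0 for empty input are illustrative choices rather than part of the original table.

```python
from typing import List, Sequence, Tuple


def bibgm(s1: Sequence[float], s2: Sequence[float],
          delta: float) -> Tuple[List[Tuple[float, float]], float]:
    """Greedy matching in the spirit of Table 2.1.

    Returns the matched (s1 epoch, s2 epoch) pairs and the fraction of
    matched epochs among all observed epochs.
    """
    m = n = 0
    matches: List[Tuple[float, float]] = []
    while m < len(s1) and n < len(s2):
        if s2[n] < s1[m] - delta:
            n += 1  # s2[n] is too early for any remaining s1 epoch: chaff
        elif s2[n] > s1[m] + delta:
            m += 1  # s1[m] is too early for any remaining s2 epoch: chaff
        else:
            matches.append((s1[m], s2[n]))  # within delta of each other
            m += 1
            n += 1
    total = len(s1) + len(s2)
    # Each matched pair accounts for two matched epochs.
    fraction = 2 * len(matches) / total if total else 0.0
    return matches, fraction


# Hypothetical epochs loosely mirroring Fig. 2.3: the first epoch of s1 is
# matched, the second is labeled chaff, and the third is matched.
pairs, r_hat = bibgm([0.0, 0.4, 2.0], [0.3, 2.2, 3.0], delta=0.5)
```

Each epoch is visited at most once, which reflects the linear complexity noted above.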

2.3.2  Parametric  Flow  Detection  under  Poisson  Models 

In this section, we assume knowledge of the underlying parametric model for the
transmission processes and propose a detection algorithm called the Bidirectional Flow
Detector (BFD). BFD is a threshold decision rule based on $\hat{R}(t)$. Specifically, BFD
with a threshold $\tau$ takes the following form:

If $\hat{R}(t) > \tau$, declare $\mathcal{H}_1$;
otherwise, declare $\mathcal{H}_0$.

If $(\mathbf{S}_i)_{i=1}^{2}$ contains a bidirectional flow, $\hat{R}(t)$ is, by definition,
an upper bound on $R(t)$ and will tend to be greater than in the case where
$(\mathbf{S}_i)_{i=1}^{2}$ is an independent pair; this is the intuition behind declaring
$\mathcal{H}_1$ when $\hat{R}(t)$ is greater than $\tau$.

Under $\mathcal{H}_1$, since $\hat{R}(t) \geq R(t)$, BFD with $\tau$ can detect any flow
with $R(t) > \tau$, and a smaller $\tau$ makes BFD capable of detecting a larger set of
flows. However, a smaller $\tau$ results in a higher false alarm probability. Hence, there
exists a trade-off between the detection ability of BFD and its false alarm probability,
and we need to consult the parametric model for $(\mathbf{S}_i)_{i=1}^{2}$ under
$\mathcal{H}_0$ to find out how small $\tau$ should be. Specifically, if under
$\mathcal{H}_0$, as $t$ increases, $\hat{R}(t)$ converges to or stays close to a certain
constant $\tau_0$ with high probability, we can set $\tau$ slightly greater than $\tau_0$
and make the false alarm probability negligible as $t$ grows. For homogeneous Poisson
traffic, the following convergence result gives guidance for setting $\tau$.


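Theorem 2.3.2 Under $\mathcal{H}_0$, if $\mathbf{S}_1$ and $\mathbf{S}_2$ are homogeneous
Poisson processes with rates $\lambda_1$ and $\lambda_2$ respectively, then as $t$ grows to
infinity, $\hat{R}(t)$ converges almost surely (a.s.) to

$$
\begin{cases}
\dfrac{2\lambda_1\lambda_2\left(1 - e^{2\Delta(\lambda_1 - \lambda_2)}\right)}{(\lambda_1 + \lambda_2)\left(\lambda_2 - \lambda_1 e^{2\Delta(\lambda_1 - \lambda_2)}\right)} & \text{if } \lambda_1 \neq \lambda_2, \\[2ex]
\dfrac{2\lambda\Delta}{1 + 2\lambda\Delta} & \text{if } \lambda_1 = \lambda_2 = \lambda.
\end{cases}
$$
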

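Proof: See Section 2.6.

In particular, if $(\mathbf{S}_i)_{i=1}^{2}$ under $\mathcal{H}_0$ and
$(\mathbf{W}_i)_{i=1}^{2}$ under $\mathcal{H}_1$ are homogeneous Poisson processes, the
following theorem states that any bidirectional flow with a positive rate is detectable
regardless of the amount of chaff epochs.

Theorem 2.3.3 Suppose that (i) under $\mathcal{H}_0$, $\mathbf{S}_1$ and $\mathbf{S}_2$ are
homogeneous Poisson processes, (ii) under $\mathcal{H}_1$, $\mathbf{W}_1$ and $\mathbf{W}_2$
are homogeneous Poisson processes, and (iii) under both hypotheses, the rates^3 of
$\mathbf{S}_1$ and $\mathbf{S}_2$ are $\lambda_1$ and $\lambda_2$, respectively. Then, for
any $\eta \in (0, 1)$, there exists a proper threshold $\tau$ such that BFD with $\tau$ can
consistently detect^4 any bidirectional flow with $R > \eta$ a.s., with the false alarm
probability decaying exponentially fast as the sample size grows. In particular, for
$\eta \in \left(0, \frac{2\min\{\lambda_1, \lambda_2\}}{\lambda_1 + \lambda_2}\right)$, the
following $\tau$ can be used:

$$
\tau =
\begin{cases}
\dfrac{2\lambda_1 - 2\lambda_2 \dfrac{\lambda_1(4 - \eta) - \lambda_2\eta}{\lambda_2(4 - \eta) - \lambda_1\eta}\, e^{2\Delta(\lambda_1 - \lambda_2)}}{(\lambda_2 + \lambda_1)\left(1 - \dfrac{\lambda_1(4 - \eta) - \lambda_2\eta}{\lambda_2(4 - \eta) - \lambda_1\eta}\, e^{2\Delta(\lambda_1 - \lambda_2)}\right)} & \text{if } \lambda_1 \neq \lambda_2; \\[4ex]
\dfrac{\eta + 2\lambda(2 - \eta)\Delta}{2 + 2\lambda(2 - \eta)\Delta} & \text{if } \lambda_1 = \lambda_2 = \lambda.
\end{cases}
$$

Proof: See Section 2.6.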


It can be shown that the suggested $\tau$ in Theorem 2.3.3 is a strictly increasing
function of $\eta$, and as $\eta$ decreases to 0, it decreases to the limit of $\hat{R}(t)$
given in Theorem 2.3.2. This means that to detect flows with a smaller flow fraction,
$\tau$ should be closer to the limiting $\hat{R}(t)$ value under $\mathcal{H}_0$.
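The relation between the two closed forms can be checked numerically. The sketch below evaluates the limit of Theorem 2.3.2 and the threshold of Theorem 2.3.3 as read from the theorem statements; the function names `h0_limit` and `bfd_threshold` are hypothetical, and $\eta$ is assumed to be small enough that both $\lambda_1(4-\eta) - \lambda_2\eta$ and $\lambda_2(4-\eta) - \lambda_1\eta$ stay positive.

```python
import math


def h0_limit(lam1: float, lam2: float, delta: float) -> float:
    """Limit of the matched fraction for independent homogeneous Poisson
    traffic, as stated in Theorem 2.3.2."""
    if math.isclose(lam1, lam2):
        lam = 0.5 * (lam1 + lam2)
        return 2 * lam * delta / (1 + 2 * lam * delta)
    e = math.exp(2 * delta * (lam1 - lam2))
    return 2 * lam1 * lam2 * (1 - e) / ((lam1 + lam2) * (lam2 - lam1 * e))


def bfd_threshold(eta: float, lam1: float, lam2: float, delta: float) -> float:
    """Threshold suggested in Theorem 2.3.3 for flows with fraction above eta."""
    if math.isclose(lam1, lam2):
        lam = 0.5 * (lam1 + lam2)
        x = 2 * lam * (2 - eta) * delta
        return (eta + x) / (2 + x)
    # Assumed valid range: both numerator and denominator of rho are positive.
    rho = (lam1 * (4 - eta) - lam2 * eta) / (lam2 * (4 - eta) - lam1 * eta)
    e = rho * math.exp(2 * delta * (lam1 - lam2))
    return (2 * lam1 - 2 * lam2 * e) / ((lam1 + lam2) * (1 - e))
```

With $\lambda_1 = 1$, $\lambda_2 = 2$, and $\Delta = 0.5$, the threshold at $\eta = 0$ coincides with the limit under the independent-traffic hypothesis and grows as $\eta$ increases, in line with the remark above.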

^3 By rates, we mean that $\lim_{t \to \infty} N_i(t)/t = \lambda_i$ a.s., where $N_i(t)$
denotes the number of epochs of $\mathbf{S}_i$ in $[0, t]$.

^4 In other words, if the distributions of $(\mathbf{S}_i)_{i=1}^{2}$ under $\mathcal{H}_0$
and $\mathcal{H}_1$ satisfy (i), (ii), (iii), and $R > \eta$ a.s., then under all those
distributions, the false alarm and miss detection probabilities vanish as $t$ increases
(as in Definition 2.2.3).

Instead of knowledge of the parametric model, training data can also be
used to set $\tau$. If a large set of different realizations of $\mathcal{H}_0$ traffic is
available, we can run BiBGM over each realization in the training data set, estimate the
statistical behavior of $\hat{R}(t)$ under $\mathcal{H}_0$, and set $\tau$ such that the
probability that $\hat{R}(t) > \tau$ under $\mathcal{H}_0$ (i.e., the false alarm
probability) becomes reasonably small as $t$ grows. However, if neither a parametric model
nor training data is available, it is non-trivial to determine an appropriate $\tau$; this
is the case in many practical applications.
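One simple way to turn training realizations into a threshold is to take an upper empirical quantile of the matched fractions obtained under independent traffic. The sketch below illustrates this idea only; the quantile rule, the function name `threshold_from_training`, and the target level `alpha` are assumptions and are not prescribed by the text.

```python
import math
from typing import Sequence


def threshold_from_training(h0_fractions: Sequence[float],
                            alpha: float = 0.05) -> float:
    """Empirical (1 - alpha) quantile of matched fractions from H0 training runs."""
    if not h0_fractions:
        raise ValueError("at least one training realization is required")
    ordered = sorted(h0_fractions)
    # Smallest order statistic with at least a (1 - alpha) share of the
    # training values at or below it.
    k = math.ceil((1 - alpha) * len(ordered)) - 1
    return ordered[max(k, 0)]
```

Here `h0_fractions` would hold one matched fraction per training realization, each obtained by running BiBGM on that realization.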


2.4  Nonparametric  Flow  Detection 

In  this  section,  we  assume  that  neither  a  parametric  model  nor  a  training  data  set 
is  available,  and  present  a  novel  nonparametric  flow  detector. 


2.4.1  Algorithm  Structure 


We begin by introducing the structure and the main intuition of our detection
algorithm. Fig. 2.4 describes its structure. A key component of our algorithm is
a transformation of the measurements $(\mathbf{s}_i)_{i=1}^{2}$, which we refer to as
Independent Traffic Approximation (ITA). As the name suggests, ITA produces an
approximately independent pair of transmission processes $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$
such that $\tilde{\mathbf{s}}_i$ has traffic characteristics (e.g., normalized intensity^5,
interarrival distribution) similar to those of $\mathbf{s}_i$. After
$(\tilde{\mathbf{s}}_i)_{i=1}^{2}$ is generated, we compare the statistical characteristics
of $(\mathbf{s}_i)_{i=1}^{2}$ and $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$. If the true hypothesis
is $\mathcal{H}_0$, both $(\mathbf{s}_i)_{i=1}^{2}$ and $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$
are independent pairs with similar traffic characteristics. On the other hand, if
$\mathcal{H}_1$ is true, $(\mathbf{s}_i)_{i=1}^{2}$ and $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$
have similar traffic characteristics, but $(\mathbf{s}_i)_{i=1}^{2}$ is a correlated pair
containing a flow while $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$ approximates an independent
pair. Thus, we attempt to infer the true hypothesis by exploiting the gap between the
statistical characteristics of $(\mathbf{s}_i)_{i=1}^{2}$ and
$(\tilde{\mathbf{s}}_i)_{i=1}^{2}$: the larger the gap, the more probable $\mathcal{H}_1$ is.

Figure 2.4: The structure of the nonparametric detection algorithm.

^5 The normalized intensity of $\mathbf{S}_i$ for the $[0, t]$ interval represents the
overall trend of its intensity change in $[0, t]$. Suppose that the local intensity of
$\mathbf{S}_i$ is well defined in $[0, t]$: i.e.,
$\lambda_i(x) = \lim_{\delta \to 0^+} \frac{E\{N_i(x, x + \delta)\}}{\delta}$ exists for all
$x \in [0, t]$, where $N_i(a, b)$ denotes the number of epochs in $[a, b)$. The normalized
intensity $\lambda_i^{(t)}$ of $\mathbf{S}_i$ for $[0, t]$ is defined as the time-scaled
version of the intensity function: $\lambda_i^{(t)}(x) = \lambda_i(tx)$, $x \in [0, 1]$.

2.4.2  Nonparametric  Bidirectional  Flow  Detector 

This section presents our detection algorithm, referred to as the Nonparametric
Bidirectional Flow Detector (NBFD). Here, we simply assume that ITA generates an
output $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$ with the desired properties: (i)
$(\tilde{\mathbf{s}}_i)_{i=1}^{2}$ approximates an independent pair of transmission
processes, and (ii) its normalized intensities and interarrival distributions resemble
those of $(\mathbf{s}_i)_{i=1}^{2}$. The details of ITA are deferred to the next
section; here we focus on the operation of NBFD.

As described in Fig. 2.4, NBFD observes $(\mathbf{s}_i)_{i=1}^{2}$ in $[0, t]$ and first
runs ITA to generate $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$. The next step is to compare the
statistical characteristics of $(\mathbf{s}_i)_{i=1}^{2}$ and
$(\tilde{\mathbf{s}}_i)_{i=1}^{2}$. Theorem 2.3.3 showed, although under the homogeneous
Poisson traffic assumption, that the maximum schedulable flow fraction $\hat{R}(t)$ can be
used effectively to distinguish whether the measurements come from a flow-containing pair
or an independent pair. Moreover, $\hat{R}(t)$ can be easily evaluated by running BiBGM;
hence, NBFD employs $\hat{R}(t)$. NBFD runs BiBGM separately on $(\mathbf{s}_i)_{i=1}^{2}$
and $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$ and compares the fractions of the matched epochs
in the two cases, denoted by $\hat{R}(t)$ and $\tilde{R}(t)$ respectively. If the true
hypothesis is $\mathcal{H}_0$, both $(\mathbf{S}_i)_{i=1}^{2}$ and
$(\tilde{\mathbf{S}}_i)_{i=1}^{2}$ are independent pairs, and they have similar normalized
intensities and interarrival distributions; this implies that $\hat{R}(t)$ and
$\tilde{R}(t)$ are expected to be close under $\mathcal{H}_0$. On the other hand, when
$\mathcal{H}_1$ is true, $(\mathbf{S}_i)_{i=1}^{2}$ and $(\tilde{\mathbf{S}}_i)_{i=1}^{2}$
have similar normalized intensity functions and interarrival distributions, but
$(\mathbf{S}_i)_{i=1}^{2}$ contains a flow while $(\tilde{\mathbf{S}}_i)_{i=1}^{2}$
approximates an independent pair; hence, $\hat{R}(t)$ is expected to be greater than
$\tilde{R}(t)$. Based on the above intuition, given $(\mathbf{s}_i)_{i=1}^{2}$ in
$[0, t]$, NBFD with $\epsilon$ works as follows:

1. Run ITA on $(\mathbf{s}_i)_{i=1}^{2}$ in $[0, t]$ to generate $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$.

2. Run BiBGM on $(\mathbf{s}_i)_{i=1}^{2}$ and $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$:
$\hat{R}(t)$ and $\tilde{R}(t)$ denote the fractions of the matched epochs for
$(\mathbf{s}_i)_{i=1}^{2}$ and $(\tilde{\mathbf{s}}_i)_{i=1}^{2}$ respectively.

3. If $\hat{R}(t) > \tilde{R}(t) + \epsilon$, declare $\mathcal{H}_1$; otherwise, declare
$\mathcal{H}_0$,

where $\epsilon$ is a positive number added to $\tilde{R}(t)$ to allow a small difference
between $\hat{R}(t)$ and $\tilde{R}(t)$ under $\mathcal{H}_0$. $\tilde{R}(t)$ can also be
seen as an estimate of what $\hat{R}(t)$ would be under $\mathcal{H}_0$. Therefore,
recalling the discussion of setting $\tau$ of BFD in Section 2.3.2, NBFD can be
alternatively interpreted as BFD with a measurement-dependent threshold
$\tilde{R}(t) + \epsilon$.
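The three steps of NBFD can be sketched as follows. Because ITA is only described in the next section, this sketch treats it as a caller-supplied function `ita` that maps the measured pair to the approximately independent pair; that interface, the helper name `matched_fraction`, and the function name `nbfd` are assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

Epochs = Sequence[float]


def matched_fraction(s1: Epochs, s2: Epochs, delta: float) -> float:
    """Fraction of epochs matched by a two-pointer greedy pass (cf. Table 2.1)."""
    m = n = pairs = 0
    while m < len(s1) and n < len(s2):
        if s2[n] < s1[m] - delta:
            n += 1            # s2[n] cannot be matched: chaff
        elif s2[n] > s1[m] + delta:
            m += 1            # s1[m] cannot be matched: chaff
        else:
            pairs += 1        # epochs within delta of each other are matched
            m += 1
            n += 1
    total = len(s1) + len(s2)
    return 2 * pairs / total if total else 0.0


def nbfd(s1: Epochs, s2: Epochs, delta: float, eps: float,
         ita: Callable[[Epochs, Epochs], Tuple[Epochs, Epochs]]) -> int:
    """Return 1 to declare H1 (flow present) and 0 to declare H0."""
    r_hat = matched_fraction(s1, s2, delta)      # statistic of the measurements
    t1, t2 = ita(s1, s2)                         # step 1: approximate H0 traffic
    r_tilde = matched_fraction(t1, t2, delta)    # step 2: statistic of ITA output
    return 1 if r_hat > r_tilde + eps else 0     # step 3: compare with margin eps
```

The measurement-dependent threshold appears here as `r_tilde + eps`, which matches the interpretation of NBFD as BFD with a data-driven threshold.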

It is evident from the form of NBFD that a smaller $\epsilon$ decreases
the miss detection probability. However, decreasing $\epsilon$ increases the false
alarm probability. Because of the trade-off associated with the choice of $\epsilon$ and
the nonparametric nature of our problem, it is difficult to claim that a certain
$\epsilon$ value is the best choice. The experimental results in Section 2.5 suggest that
setting $\epsilon \approx 0.05$ generally results in satisfactory performance.

Figure 2.5: ITA samples $w$-second intervals $\{A_1, A_2, \ldots\}$ and
$\{B_1, B_2, \ldots\}$ from $(\mathbf{s}_i)_{i=1}^{2}$ and assembles them to generate
$(\tilde{\mathbf{s}}_i)_{i=1}^{2}$.


2.4.3  Independent  Traffic  Approximation 

In this section, we present how ITA approximates an independent pair of transmission
processes whose normalized intensity and interarrival distribution are similar to those
of $(\mathbf{S}_i)_{i=1}^{2}$.

Fig. 2.5 illustrates the operation of ITA. ITA has two parameters: the sampling
window width $w$ and the gap $a$ ($a > \Delta$) between neighboring sampling
windows. As described in Fig. 2.5, ITA samples the epochs in the $w$-second windows
separated by $a$-second gaps, shifts them properly, and assembles them to
approximate independent traffic. The intuition behind ITA is that if the gap $a$
between two sampling windows is sufficiently large, the epochs in different windows
will tend to be approximately uncorrelated. Note that when $(\mathbf{S}_i)_{i=1}^{2}$
contains a bidirectional flow, ITA disassembles the flow part and significantly reduces
the flow-induced correlation. In addition, since we use a sequence of sampled intervals
of $\mathbf{S}_i$ for generating $\tilde{\mathbf{S}}_i$, $\mathbf{S}_i$ and
$\tilde{\mathbf{S}}_i$ are expected to share some common characteristics.

Figure 2.6: $\mathbf{S}_1$ and $\mathbf{S}_2$ are non-homogeneous Poisson processes, and
$\lambda_1(x)$ and $\lambda_2(x)$ denote their local intensities at time $x$ respectively.
$\lambda_1(x)$ and $\lambda_2(x)$ can only take values from $\{\mu_1, \mu_2\}$
($\mu_1 \neq \mu_2$). The figure describes the intensity change of $\mathbf{S}_1$,
$\mathbf{S}_2$, $\tilde{\mathbf{S}}_1$, and $\tilde{\mathbf{S}}_2$ using two types of bars.
The bars filled with slant lines represent the intervals in which $\lambda_i(x) = \mu_1$,
and the blue bars represent the intervals in which $\lambda_i(x) = \mu_2$. The numbers
above or below the intervals describe the correspondence between the sampled intervals in
$\mathbf{S}_i$ and the intervals in $\tilde{\mathbf{S}}_i$.

To illustrate how the normalized intensities of $\mathbf{S}_i$ and $\tilde{\mathbf{S}}_i$
are related, Fig. 2.6 describes the intensity change of $(\mathbf{S}_i)_{i=1}^{2}$ and
$(\tilde{\mathbf{S}}_i)_{i=1}^{2}$ for an example where $\mathbf{S}_1$ and $\mathbf{S}_2$
are non-homogeneous Poisson processes with two possible intensity levels. As
observed in Fig. 2.6, if the average time that the intensity of $\mathbf{S}_i$ stays at
one level is much longer than $2(w + a)$ seconds, the normalized intensity function of
$\tilde{\mathbf{S}}_i$ is similar to that of $\mathbf{S}_i$. As for the interarrival
distribution, if $w$ is sufficiently large so that a $w$-second sampling window is likely
to contain a large number of points, the interarrival distribution of
$\tilde{\mathbf{S}}_i$ will resemble that of $\mathbf{S}_i$. Moreover, if the interarrival
distribution of $\mathbf{S}_i$ varies slowly over time, as in the example of Fig. 2.6, the
interarrival distribution of $\tilde{\mathbf{S}}_i$ will also change over time with a
similar trend, even though the time scale is different due to the sampling procedure of
ITA. Note that resampling from the empirical interarrival distributions (i.e., generating
i.i.d. interarrival times of $\tilde{\mathbf{S}}_i$ from the empirical interarrival
distribution of $\mathbf{S}_i$, for $i = 1, 2$) can also produce an independent pair of
point processes. However, unlike $(\tilde{\mathbf{S}}_i)_{i=1}^{2}$ of ITA, when
$(\mathbf{S}_i)_{i=1}^{2}$ is non-stationary, the results of such resampling approaches may
have totally different dynamics from $(\mathbf{S}_i)_{i=1}^{2}$; they may not capture the
patterns of intensity change or interarrival distribution change in
$(\mathbf{S}_i)_{i=1}^{2}$.

Now, we check whether $(\tilde{\mathbf{S}}_i)_{i=1}^{2}$ can approximate an independent
pair. When $\mathbf{S}_1$ and $\mathbf{S}_2$ are independent, it directly follows that
$\tilde{\mathbf{S}}_1$ and $\tilde{\mathbf{S}}_2$ are independent. On the other hand, if
$(\mathbf{S}_i)_{i=1}^{2}$ contains a bidirectional flow, $\tilde{\mathbf{S}}_1$ and
$\tilde{\mathbf{S}}_2$ are not necessarily independent. However, assuming that correlation
across time is weak and the gap $a$ is much larger than $\Delta$, the epochs in different
windows are expected to be approximately uncorrelated: i.e., in Fig. 2.5, the epochs in
each $A_i$ will be approximately uncorrelated with the epochs in $\bigcup_{j \geq 1} B_j$.
This implies that when temporal correlation is weak,
$(\tilde{\mathbf{S}}_1, \tilde{\mathbf{S}}_2)$ is expected to approximate an independent
pair. The following example illustrates a case where $(\mathbf{S}_i)_{i=1}^{2}$ has
weak temporal correlation. Suppose $\mathbf{S}_1$ is a Poisson process and $\mathbf{S}_2$
is such that $\mathbf{S}_2(i) = \mathbf{S}_1(i) + D_i$, $\forall i$, where the $D_i$'s are
independent random delays bounded by $\Delta$ a.s.: i.e., $(\mathbf{S}_i)_{i=1}^{2}$ is a
unidirectional flow with a delay constraint $\Delta$. The memoryless property of Poisson
processes implies that epochs in an interval are correlated with epochs in another
disjoint interval only if the gap between the two intervals is less than $\Delta$ seconds.
Hence, if $a > \Delta$, epochs in different sampling windows of ITA are independent,
implying that $(\tilde{\mathbf{S}}_i)_{i=1}^{2}$ is an independent pair.

Under $\mathcal{H}_0$, NBFD requires $(\tilde{S}_i)_{i=1}^2$ to be an independent pair with
traffic characteristics similar to those of $(S_i)_{i=1}^2$, because $R(t)$ and
$\tilde{R}(t)$ have to be close under $\mathcal{H}_0$. However, under $\mathcal{H}_1$, NBFD
does not require the independence of $\tilde{S}_1$ and $\tilde{S}_2$, even though the
independent case is ideal. Under $\mathcal{H}_1$, NBFD needs $\tilde{R}(t)$ to be less than
$R(t)$, and this can be achieved by making $(\tilde{S}_i)_{i=1}^2$ very unlikely to contain a
flow: as can be inferred from the discussion in Section 2.3.2, the maximum schedulable flow
fraction (e.g., $R(t)$ and $\tilde{R}(t)$ of NBFD) tends to be higher when the measurements
come from a flow-containing pair. Note that ITA does make $(\tilde{S}_i)_{i=1}^2$ unlikely to
contain a flow by tearing apart the flow part of $(S_i)_{i=1}^2$ in its sampling procedure.

Given the measurements $(s_i)_{i=1}^2$ in $[0, t]$, ITA with $(w, \alpha)$ generates
$(\tilde{s}_i)_{i=1}^2$ as follows:

1. Initially, $\tilde{s}_1$ and $\tilde{s}_2$ contain no epoch.

2. For $i = 0, 1, \ldots, \lfloor t / (2(w + \alpha)) \rfloor - 1$:

(a) Take the epochs of $s_1$ in $[2i(w + \alpha),\ 2i(w + \alpha) + w]$, subtract
$i(w + 2\alpha)$ from the epochs, and add them to $\tilde{s}_1$.

(b) Take the epochs of $s_2$ in $[(2i + 1)(w + \alpha),\ (2i + 1)(w + \alpha) + w]$, subtract
$i(w + 2\alpha) + (w + \alpha)$ from the epochs, and add them to $\tilde{s}_2$.

The  implementation  of  ITA  is  given  in  Table  2.2.  As  can  be  seen  from  Table  2.2, 
ITA  has  linear  computational  complexity  with  respect  to  the  sample  size. 

One drawback of ITA is that it throws away more than half of the measurements during the
sampling procedure, thereby restricting the sample size of $(\tilde{s}_i)_{i=1}^2$ to at
most half of that of $(s_i)_{i=1}^2$. Since $(\tilde{s}_i)_{i=1}^2$, together with
$(s_i)_{i=1}^2$, is used to calculate the decision statistic of NBFD, a large sample size is
desirable. Therefore, we suggest a modification of ITA, referred to as ITA-double (ITAd), to
double the sample size of $(\tilde{s}_i)_{i=1}^2$.

Table 2.2: Independent Traffic Approximation

ITA($s_1$, $s_2$, $t$, $w$, $\alpha$):

    $\tilde{s}_1 \leftarrow ()$; $\tilde{s}_2 \leftarrow ()$; $a_1 \leftarrow ()$; $a_2 \leftarrow ()$; $j \leftarrow 1$; $k \leftarrow 1$;
    for $i = 0 : 1 : \lfloor t / (2(w + \alpha)) \rfloor - 1$
        while $s_1(j) < 2i(w + \alpha)$
            $j \leftarrow j + 1$;
        end
        while $s_1(j) < 2i(w + \alpha) + w$
            $a_1 \leftarrow a_1 \oplus s_1(j)$; $j \leftarrow j + 1$;
        end
        while $s_2(k) < (2i + 1)(w + \alpha)$
            $k \leftarrow k + 1$;
        end
        while $s_2(k) < (2i + 1)(w + \alpha) + w$
            $a_2 \leftarrow a_2 \oplus s_2(k)$; $k \leftarrow k + 1$;
        end
        $a_1 \leftarrow a_1 - i(w + 2\alpha)$; $\tilde{s}_1 \leftarrow \tilde{s}_1 \oplus a_1$;
        $a_2 \leftarrow a_2 - (i(w + 2\alpha) + w + \alpha)$; $\tilde{s}_2 \leftarrow \tilde{s}_2 \oplus a_2$;
        $a_1 \leftarrow ()$; $a_2 \leftarrow ()$;
    end
    return $(\tilde{s}_i)_{i=1}^2$.

* For a sequence $(x_i)_{i \ge 1}$ and a real number $r$, $(x_i)_{i \ge 1} - r = (y_i)_{i \ge 1}$
where $y_i = x_i - r$, $\forall i$.
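
The sampling steps of ITA can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function name `ita` and its argument names are hypothetical, the inputs are assumed to be sorted lists of epochs in seconds, and binary search is used for brevity instead of the single forward scan of Table 2.2 (the scan is what gives linear complexity).

```python
from bisect import bisect_left, bisect_right


def ita(s1, s2, t, w, alpha):
    """Sketch of ITA: keep alternating w-second windows of s1 and s2,
    separated by alpha-second gaps, and re-align them on a common axis.

    s1, s2 : sorted lists of epochs (seconds) observed in [0, t].
    Returns (s1_tilde, s2_tilde).
    """
    period = 2 * (w + alpha)          # one s1 window, gap, one s2 window, gap
    s1_tilde, s2_tilde = [], []
    for i in range(int(t // period)):
        # Window of s1: [2i(w+alpha), 2i(w+alpha)+w], shifted back by i(w+2alpha).
        start1 = i * period
        shift1 = i * (w + 2 * alpha)
        window1 = s1[bisect_left(s1, start1):bisect_right(s1, start1 + w)]
        s1_tilde.extend(x - shift1 for x in window1)

        # Window of s2: [(2i+1)(w+alpha), (2i+1)(w+alpha)+w],
        # shifted back by i(w+2alpha) + (w+alpha).
        start2 = start1 + (w + alpha)
        shift2 = shift1 + (w + alpha)
        window2 = s2[bisect_left(s2, start2):bisect_right(s2, start2 + w)]
        s2_tilde.extend(x - shift2 for x in window2)
    return s1_tilde, s2_tilde
```

With these shifts, the $i$th windows of both processes land on $[iw, iw + w]$, so the two outputs share one time axis even though they were sampled from disjoint parts of the observation interval.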


Figure 2.7: The sample size of $(\tilde{s}_i)_{i=1}^2$ doubles compared to ITA. Unlike ITA,
ITAd does not throw away $\{A_2, A_4, \ldots\}$ or $\{B_2, B_4, \ldots\}$; it assembles all of
$\{A_1, A_2, \ldots\}$ and $\{B_1, B_2, \ldots\}$ to generate $(\tilde{s}_i)_{i=1}^2$.

The operation of ITAd is illustrated in Fig. 2.7. In ITAd, when $S_1$ and $S_2$ are
independent, so are $\tilde{S}_1$ and $\tilde{S}_2$. However, if $(S_i)_{i=1}^2$ contains a
flow, $(\tilde{S}_i)_{i=1}^2$ is not an independent pair, because the epochs in $A_{i+1}$ and
those in $B_i$ are correlated due to the presence of the flow. Nevertheless,
$(\tilde{S}_i)_{i=1}^2$ is a concatenation of $w$-second intervals, where, in each interval,
the epochs of $\tilde{S}_1$ and $\tilde{S}_2$ are approximately uncorrelated. We believe that
this property is enough for NBFD to sense the difference in statistical characteristics
between $(S_i)_{i=1}^2$ and $(\tilde{S}_i)_{i=1}^2$ under $\mathcal{H}_1$, especially when $w$
is large. Although we have no analytical proof of the superiority of ITAd over ITA, the use
of ITAd in NBFD instead of ITA consistently resulted in better performance in all our
simulations and experiments in Section 2.5.

2.4.4  Performance  Analysis 

This  section  provides  the  analysis  of  algorithmic  efficiency  and  consistency  of 
NBFD. 

NBFD is efficient in terms of computation and memory requirements. Because its main
components, ITA and BiBGM, have linear complexity, NBFD also has linear computational
complexity with respect to the sample size. In addition, assuming that NBFD with
$(w, \alpha, \epsilon)$ is executed in real time over the transmission processes of two
nodes, it only needs to store the most recent BiBGM matches of $(s_i)_{i=1}^2$ and
$(\tilde{s}_i)_{i=1}^2$ and the timing measurements in the most recent
$2(w + \alpha)$-second interval; this is all the information needed to continue running ITA
and BiBGM over future timing measurements.

For a class of non-homogeneous Poisson traffic, NBFD has a consistency property as stated in
the following theorem.


Theorem 2.4.1 Assume that $w$ and $\alpha$ are any positive numbers with $\alpha > \Delta$.
For any $\eta \in (0, 1)$, there exists an $\bar{\epsilon} \in (0, 1)$ such that, for any
$\epsilon \in (0, \bar{\epsilon}]$, NBFD with $(w, \alpha, \epsilon)$ consistently detects any
bidirectional flow with $R > \eta$ a.s., if the distributions of $(S_i)_{i=1}^2$ under
$\mathcal{H}_0$ and $\mathcal{H}_1$ satisfy the following assumptions^6:

1. Under both hypotheses, $S_1$ and $S_2$ are non-homogeneous Poisson processes. In addition,
under $\mathcal{H}_1$, $S_i = (F_i^{12} \oplus F_i^{21}) \oplus W_i$, where $F_1^{12}$,
$F_2^{21}$, $W_1$, and $W_2$ are independent non-homogeneous Poisson processes, $F_2^{12}$
is^7 $\mathrm{sort}\{F_1^{12}(i) + \alpha_i,\ i \ge 1\}$, and $F_1^{21}$ is
$\mathrm{sort}\{F_2^{21}(i) + \beta_i,\ i \ge 1\}$, where $\{\alpha_i,\ i \ge 1\}$ and
$\{\beta_i,\ i \ge 1\}$ are random variables satisfying $\alpha_i, \beta_i \in [0, \Delta]$
almost surely. Furthermore, $\{\alpha_i,\ i \ge 1\} \perp W_1$,
$\{\beta_i,\ i \ge 1\} \perp W_2$, and^8
$\perp \{\alpha_i,\ i \ge 1\}, \{\beta_i,\ i \ge 1\}, F_1^{12}, F_2^{21}$.

2. Let $\lambda_1(t)$, $\lambda_2(t)$, $\lambda_{f1}(t)$, and $\lambda_{f2}(t)$ denote the
local intensities of $S_1$, $S_2$, $F_1^{12}$, and $F_2^{21}$ respectively. There exist two
finite sets
$\Lambda_0 = \{\mu^{(j)} \triangleq (\mu_1^{(j)}, \mu_2^{(j)}),\ 1 \le j \le M_0\}$ and
$\Lambda_1 = \{\lambda^{(k)} \triangleq (\lambda_1^{(k)}, \lambda_2^{(k)}, \lambda_{f1}^{(k)}, \lambda_{f2}^{(k)}),\ 1 \le k \le M_1\}$
with $\mu_i^{(j)} > 0$, $\lambda_i^{(k)} > 0$, $i = 1, 2$, $\forall j, k$. Under
$\mathcal{H}_0$, $(\lambda_1(t), \lambda_2(t))$ can only take values in $\Lambda_0$. Under
$\mathcal{H}_1$, $\lambda(t) = (\lambda_1(t), \lambda_2(t), \lambda_{f1}(t), \lambda_{f2}(t))$
can only take values in $\Lambda_1$.

3. Under $\mathcal{H}_0$, if $c(t)$ denotes the number of times that
$(\lambda_1(t), \lambda_2(t))$ changes its value in $[0, t]$, then
$\lim_{t \to \infty} c(t)/t = 0$. Similarly, under $\mathcal{H}_1$, if $c(t)$ denotes the
number of times that $\lambda(t)$ changes its value in $[0, t]$, then
$\lim_{t \to \infty} c(t)/t = 0$.

4. Under $\mathcal{H}_0$, if $p_k(t)$ ($1 \le k \le M_0$) denotes the fraction of the time in
$[0, t]$ that $(\lambda_1(t), \lambda_2(t)) = \mu^{(k)}$, then as $t$ increases, each
$p_k(t)$ converges. Similarly, under $\mathcal{H}_1$, if $p_k(t)$ ($1 \le k \le M_1$) denotes
the fraction of the time in $[0, t]$ that $\lambda(t) = \lambda^{(k)}$, then as $t$
increases, each $p_k(t)$ converges.

Proof: See Section 2.6. ■

^6 In other words, if the distributions of $(S_i)_{i=1}^2$ under $\mathcal{H}_0$ and
$\mathcal{H}_1$ satisfy the listed assumptions (including $R > \eta$ a.s. under
$\mathcal{H}_1$), then under all those distributions, the false alarm and miss detection
probabilities of NBFD with $(w, \alpha, \epsilon)$ vanish as $t$ grows (as in
Definition 2.2.3).

^7 For a countable set $A$ of real numbers, $\mathrm{sort}\{A\}$ is the sequence of the
elements of $A$ arranged in increasing order.

^8 For random processes $A_i$'s, $A_1 \perp A_2$ means $A_1$ and $A_2$ are independent, and
$\perp A_1, \ldots, A_n$ means $A_1, \ldots, A_n$ are independent.

The first assumption means that under $\mathcal{H}_1$, $(S_i)_{i=1}^2$ is a superposition of
three independent parts: the unidirectional flow from $S_1$ to $S_2$, the unidirectional flow
from $S_2$ to $S_1$, and the chaff parts. $\alpha_i$ and $\beta_i$ represent packet delays of
the two unidirectional flows, and they satisfy certain independence relationships involving
the flow parts and the chaff parts. The first assumption is sufficient to guarantee that the
output of ITA, $(\tilde{S}_i)_{i=1}^2$, is an independent pair under $\mathcal{H}_1$. The
second assumption implies that the local intensities of the total traffic and flows can only
take a finite number of different values. The third assumption says that the number of
intensity changes in $[0, t]$ grows as $o(t)$. Finally, the last assumption means that the
fraction of the time that the intensity vector assumes a specific value converges as the
observation time increases. Under these assumptions, Theorem 2.4.1 states that a
bidirectional flow with any positive rate can be consistently detected by NBFD if $\epsilon$
is properly set. Note that the assumptions do not restrict traffic to be stationary.

As pointed out by Paxson and Floyd [17], a Poisson process is not always a good model for
network arrival processes. Several network traces (e.g., Ethernet and World Wide Web
traffic) have been experimentally shown to display self-similarity [69-71], which Poisson
processes do not show. To test the performance of NBFD over non-Poisson traffic, we will
evaluate NBFD in the following section using LBL TCP traces, which were used in [17] to
invalidate Poisson modeling, and real-world measurements from MSN VoIP sessions.

2.5 Numerical Results

NBFD was tested using synthetic Poisson traffic, LBL TCP traces, and real-world measurements
from MSN VoIP sessions. Comparison with other passive flow detectors is also provided: the
wavelet analysis in [1], Detect-Attack-Chaff (DAC) in [13], and the random projection method
in [16].

The wavelet analysis [1] calculates the wavelet coefficients of $N_1(t)$ and $N_2(t)$ using
the Haar mother wavelet with a sufficiently large scale, where $N_i(t)$ is the number of
epochs of $S_i$ in $[0, t]$. Then, it calculates Pearson's correlation coefficient between
the wavelet coefficients of $N_1(t)$ and those of $N_2(t)$, and declares $\mathcal{H}_1$ if
the correlation coefficient is greater than a predetermined threshold $\kappa$; otherwise,
it declares $\mathcal{H}_0$. The intuition of the algorithm is based on their analysis under
the Poisson traffic assumption: the correlation coefficient converges to a positive constant
as the scale^9 grows to infinity if $(S_i)_{i=1}^2$ contains a flow.

DAC [13] is based on the intuition that, as $t$ increases, $|N_1(t) - N_2(t)|$ tends to grow
large when $S_1$ and $S_2$ are independent, whereas it tends to stay small if
$(S_i)_{i=1}^2$ contains a flow with a much higher rate than the chaff part. DAC with a
parameter^10 $p_\Delta$ monitors $|N_1(t) - N_2(t)|$. At every $8(p_\Delta + 1)^2$ packet
transmissions, both $N_1$ and $N_2$ are reset to zero and new counting begins. It declares
$\mathcal{H}_0$ if $|N_1(t) - N_2(t)|$ grows larger than a threshold $2p_\Delta$. If
$|N_1(t) - N_2(t)|$ stays less than $2p_\Delta$ during the whole observation duration, DAC
declares $\mathcal{H}_1$. Note that under $\mathcal{H}_1$, if bursty chaff transmissions
occur in either node, $|N_1(t) - N_2(t)|$ may suddenly grow larger than $2p_\Delta$, thereby
resulting in a miss detection. Hence, DAC is vulnerable to bursty chaff insertion.

^9 Since the wavelet analysis relies on the convergence of the correlation coefficient as the
scale grows, a large scale is desired. However, given a fixed observation duration, using too
large a scale can cause the sample size of the correlation coefficient estimation (i.e., the
number of wavelet coefficients) to be very small. To prevent this, in our experiments, the
sample size is fixed to be 100, and the scale is set to be (the observation duration)/100.

^10 In [13], $p_\Delta$ is defined to be a uniform upper bound on the number of epochs of a
node ($S_1$ or $S_2$) in any $\Delta$-sec interval. However, none of our test traces
guarantees such a uniform upper bound. Hence, we tried DAC with various $p_\Delta$ values,
which include numbers large enough to bound the number of epochs in any $\Delta$-sec interval
with high probability.
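
The counting rule of DAC described above can be sketched as follows. This is only an illustration of the description in this section, not the algorithm of [13] in full: the function name `dac` is hypothetical, and counting the $8(p_\Delta + 1)^2$ transmissions over both nodes together is an assumption of this sketch.

```python
def dac(s1, s2, p_delta):
    """Sketch of the DAC decision rule as summarized in the text.

    s1, s2  : lists of epochs of the two nodes.
    p_delta : DAC parameter (assumed bound on epochs per Delta-second interval).
    Returns "H0" (no flow) or "H1" (flow).
    """
    # Merge the two epoch sequences in time order, remembering the node.
    events = sorted([(x, 1) for x in s1] + [(x, 2) for x in s2])
    # Assumption: the reset period counts transmissions of both nodes together.
    reset_every = 8 * (p_delta + 1) ** 2
    n1 = n2 = seen = 0
    for _, node in events:
        if node == 1:
            n1 += 1
        else:
            n2 += 1
        seen += 1
        if abs(n1 - n2) > 2 * p_delta:
            return "H0"               # counts drifted apart: declare no flow
        if seen == reset_every:
            n1 = n2 = seen = 0        # restart counting
    return "H1"                       # counts stayed close throughout
```

A burst of chaff at one node moves `n1 - n2` by the burst size at once, which is why a single burst longer than $2p_\Delta$ is enough to produce a miss detection.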

The random projection method in [16], which we denote by RP, is based on the idea of
measuring the distance between $S_1$ and $S_2$ after random projection. It first partitions
the observation interval into time slots of length $L_{ts}$, and counts the number of epochs
in each time slot. The number of epochs of $S_i$ in the $j$th time slot is denoted by
$V_i(j)$, $i = 1, 2$, $1 \le j \le T$. Then, RP generates a set of $K$ random basis vectors
$\{B_k \in \{-1, 1\}^T,\ 1 \le k \le K\}$, where each $B_k(j)$ ($1 \le j \le T$) is either
$1$ or $-1$ with equal probability^11. After that, $V_i$ is projected on
$\{B_k,\ 1 \le k \le K\}$: $C_i(k) = \sum_{j=1}^{T} V_i(j) B_k(j)$, $i = 1, 2$,
$1 \le k \le K$. Finally, RP obtains a $K$-dimensional binary vector $\hat{C}_i$ where
$\hat{C}_i(k) = 1_{\{C_i(k) > 0\}}$, referred to as the binary sketch of $S_i$. The decision
statistic of RP is the Hamming distance between $\hat{C}_1$ and $\hat{C}_2$. If the distance
is less than a threshold $t_h$, RP declares $\mathcal{H}_1$; otherwise, $\mathcal{H}_0$ is
declared.
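
The RP decision statistic can be sketched as below. This is a minimal illustration of the steps described in this paragraph rather than the implementation of [16]: the function names, the default `num_bases=64` (the experiments use $K = 4096$), and the fixed `seed` are assumptions made for a small, reproducible example.

```python
import random


def slot_counts(epochs, num_slots, slot_len):
    """V(j): number of epochs falling in the j-th time slot."""
    counts = [0] * num_slots
    for x in epochs:
        j = int(x // slot_len)
        if 0 <= j < num_slots:
            counts[j] += 1
    return counts


def binary_sketch(counts, bases):
    """Project the count vector on each basis and keep only the sign bit."""
    return [1 if sum(v * b for v, b in zip(counts, basis)) > 0 else 0
            for basis in bases]


def rp_distance(s1, s2, duration, slot_len=0.5, num_bases=64, seed=0):
    """Hamming distance between the binary sketches of s1 and s2."""
    num_slots = int(duration // slot_len)
    rng = random.Random(seed)
    # Each basis entry is +1 or -1 with equal probability.
    bases = [[rng.choice((-1, 1)) for _ in range(num_slots)]
             for _ in range(num_bases)]
    c1 = binary_sketch(slot_counts(s1, num_slots, slot_len), bases)
    c2 = binary_sketch(slot_counts(s2, num_slots, slot_len), bases)
    return sum(a != b for a, b in zip(c1, c2))
```

Both processes must be projected on the same set of basis vectors; a small distance then indicates similar slot-count patterns, and the detector compares it with the threshold $t_h$.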


2.5.1  Simulation  Results:  Poisson  Traffic  and  LBL  traces 

We first performed Monte Carlo simulations using synthetic non-homogeneous Poisson traffic.
In the simulations, $S_1$ and $S_2$ are Poisson processes with intensity functions
$\lambda_1(t)$ and $\lambda_2(t)$ respectively. Under $\mathcal{H}_1$, $(S_i)_{i=1}^2$ is a
superposition of two independent parts, the unidirectional flow $(F_i)_{i=1}^2$ and the
chaff part $(W_i)_{i=1}^2$. $F_1$ is a Poisson process with intensity function
$\lambda_f(t)$, and $F_2$ is generated by adding a random delay to each epoch of $F_1$.
Random delays are independent and identically distributed (i.i.d.) and uniformly distributed
in $[0, \Delta]$, where $\Delta = 0.1$s. $W_1$ and $W_2$ are independent Poisson processes
with intensity functions $\lambda_1(t) - \lambda_f(t)$ and $\lambda_2(t) - \lambda_f(t)$,
respectively. In each run of the simulation, $(\lambda_1(t), \lambda_2(t), \lambda_f(t))$ is
piecewise constant, and it takes different values in the first third, the second third, and
the last third of the observation duration. Specifically, it follows one of the change
scenarios below with equal probability:

1. (15, 15, 5) → (15, 15, 12) → (15, 15, 7).

2. (25, 10, 8) → (10, 10, 8) → (10, 25, 8).

3. (25, 25, 20) → (12, 12, 7) → (8, 8, 3).

4. (21, 15, 14) → (12, 6, 5) → (12, 24, 5).

^11 About the parameters of RP, we used $L_{ts} = 0.5$s, as recommended in [16]. As explained
in [16], a large $K$ is desired since it allows us to extract more information from $S_i$. We
used $K = 4096$, which we believe is sufficiently large (four times the maximum $K$ used
in [16]).

Under $\mathcal{H}_0$, $S_1$ and $S_2$ are independent Poisson processes, and in each run of
the simulation, $(\lambda_1(t), \lambda_2(t))$ follows one of the above change scenarios
(with no $\lambda_f$ part) with equal probability. In the real world, such changes in
intensity may correspond to the beginning of new sessions, the end of old sessions, the rate
change of existing sessions, and so on. All change scenarios have the same average rates, but
each scenario displays different dynamics. With this simulation setting, we aimed at testing
the performance of detectors over non-stationary traffic displaying possibly different
dynamics in each observation interval.
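
The traffic generation described above can be sketched in Python as follows. This is a minimal sketch, not the simulation code used for the reported results: all function names are hypothetical, and the piecewise-constant Poisson process is generated by restarting a homogeneous process in each third of the observation interval, which is valid because of the memoryless property.

```python
import random

# (lambda_1, lambda_2, lambda_f) in each third of the observation duration.
SCENARIOS = [
    [(15, 15, 5), (15, 15, 12), (15, 15, 7)],
    [(25, 10, 8), (10, 10, 8), (10, 25, 8)],
    [(25, 25, 20), (12, 12, 7), (8, 8, 3)],
    [(21, 15, 14), (12, 6, 5), (12, 24, 5)],
]


def poisson_epochs(rate, start, end, rng):
    """Homogeneous Poisson epochs in [start, end) with the given rate."""
    epochs, t = [], start
    if rate <= 0:
        return epochs
    while True:
        t += rng.expovariate(rate)
        if t >= end:
            return epochs
        epochs.append(t)


def piecewise_poisson(rates, duration, rng):
    """Poisson process whose rate is constant on equal-length segments."""
    seg = duration / len(rates)
    epochs = []
    for n, rate in enumerate(rates):
        epochs.extend(poisson_epochs(rate, n * seg, (n + 1) * seg, rng))
    return epochs


def generate_h1(duration, delta=0.1, rng=None):
    """Flow-containing pair: S_i = F_i superposed with chaff W_i."""
    rng = rng or random.Random()
    sc = rng.choice(SCENARIOS)
    f1 = piecewise_poisson([lf for _, _, lf in sc], duration, rng)
    # F_2: each flow epoch delayed by an i.i.d. Uniform[0, delta] delay.
    f2 = [x + rng.uniform(0, delta) for x in f1]
    w1 = piecewise_poisson([l1 - lf for l1, _, lf in sc], duration, rng)
    w2 = piecewise_poisson([l2 - lf for _, l2, lf in sc], duration, rng)
    return sorted(f1 + w1), sorted(f2 + w2)


def generate_h0(duration, rng=None):
    """Independent pair following the same intensity scenarios."""
    rng = rng or random.Random()
    sc = rng.choice(SCENARIOS)
    s1 = piecewise_poisson([l1 for l1, _, _ in sc], duration, rng)
    s2 = piecewise_poisson([l2 for _, l2, _ in sc], duration, rng)
    return s1, s2
```

In this sketch, delayed flow epochs of $S_2$ may fall slightly beyond the observation duration (by at most $\Delta$); a detector would simply ignore epochs outside $[0, t]$.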

Fig. 2.8 shows the ROC curves of NBFD (with ITAd), NBFD (with ITA), the wavelet analysis,
DAC, and RP. To obtain the ROC curves, we increased $\epsilon$ of NBFD and $\kappa$ of the
wavelet analysis from 0 to 1 with an increment of 0.01, $p_\Delta$ of DAC from 4 to 100 with
an increment of 2, and $t_h$ of RP from 0 to $K$ ($K = 4096$) with an increment of 1 while
plotting $(P_F, 1 - P_M)$ of each case, where $P_F$ and $P_M$ denote the false alarm
probability and miss detection probability respectively.

Figure 2.8: ROC curves of NBFD (ITAd), NBFD (ITA), the wavelet analysis, DAC, and RP for
different observation durations: NBFD parameters are $w = 2$s and $\alpha = \Delta = 0.1$s,
and the number of Monte Carlo runs is 10000.

When we further increased the sample size, the ROC curves of NBFD (ITAd), NBFD (ITA), and the
wavelet analysis approached the upper left corner, implying that perfect detection is
possible if the thresholds are properly set. On the other hand, DAC and RP resulted in
non-negligible error probabilities in every case, and their ROC curves did not improve much
from the curves in Fig. 2.8, even when we further increased the observation duration to 160s.
By comparing the ROC curves of NBFD (ITAd) and NBFD (ITA), we can observe that ITAd, the
heuristic to double the sample size of $(\tilde{s}_i)_{i=1}^2$, resulted in better detection
performance than ITA. In all our simulations and experiments, ITAd consistently showed better
results than ITA. In the rest of this section, NBFD is assumed to employ ITAd and will be
compared with other detectors.

To test the performance of detectors over non-Poisson traffic, we generated synthetic traffic
based on the TCP packet timestamps in LBL-PKT-3 (2 hours), LBL-PKT-4 (1 hour), and LBL-PKT-5
(1 hour) in [17]. These traces were measured at the Lawrence Berkeley Laboratory's wide-area
Internet gateway, and each trace was gathered on a different date in January 1994. For
details, refer to [17]. From each dataset, we extracted timestamps of TCP packets that
originated from specific users, and used them for traffic generation. For the flow part of
$\mathcal{H}_1$ traffic, timestamps of one user in LBL-PKT-3 were used as $F_1$, and $F_2$
was generated by adding a delay to each epoch in $F_1$. The delays are i.i.d. and uniformly
distributed in $[0, \Delta]$, where $\Delta = 0.1$s. For the chaff part, timestamps of one
user in LBL-PKT-4 were used as $W_1$, and those of one user in LBL-PKT-5 were used as $W_2$.
For $\mathcal{H}_0$ traffic, $S_1$ is generated by superposing traces of two users in
LBL-PKT-4, and $S_2$ is similarly generated with two users in LBL-PKT-5. Using different sets
of users for the traffic generation, we were able to create four-hour-long test traffic.

We tested DAC with various $p_\Delta$ ranging from 10 to 400, but its miss detection
probability was higher than 0.38 in every case. This is not surprising because DAC is
vulnerable to bursty chaff transmissions and LBL TCP traces were shown to be bursty in [17].
Table 2.3 shows the error probabilities of NBFD, the wavelet analysis, and RP. For NBFD, we
used $\epsilon = 0.05$. For the wavelet analysis and RP, assuming the absence of a parametric
model and training data, we have no clear standard to set their thresholds. Hence, we tried
all values from 0 to 1 with an increment of 0.01 for $\kappa$ of the wavelet analysis and all
values from 0 to 4096 with an increment of 1 for $t_h$ of RP, and found their crossover error
rates and the corresponding thresholds, which are listed in Table 2.3. NBFD and the wavelet
analysis outperformed RP, and for long observation durations (160s and 320s), NBFD performed
better than the wavelet analysis.


Table 2.3: Performance on LBL TCP traces. NBFD parameters are $w = 2$s,
$\alpha = \Delta = 0.1$s, and $\epsilon = 0.05$. The numbers of experiments are 180, 90, and
45 for observation durations 80s, 160s, and 320s, respectively. Under $\mathcal{H}_0$, the
average traffic rate is $(\lambda_1, \lambda_2) = (36.4, 36.1)$. Under $\mathcal{H}_1$,
$(\lambda_1, \lambda_2) = (36.1, 36.8)$. The fraction of chaff in $\mathcal{H}_1$ traffic is
0.37.

          NBFD              Wavelet                    RP
Time      P_F     P_M       κ       P_F     P_M        t_h     P_F     P_M
80s       0       0.100     0.19    0.034   0.056      762     0.101   0.144
160s      0       0.057     0.20    0.034   0.056      793     0.112   0.133
320s      0       0.022     0.19    0.023   0.067      774     0       0.089

2.5.2  Experimental  Results:  MSN  VoIP  Traffic 

We tested the detectors using three-and-a-half-hour-long real-world traffic involving the MSN
VoIP application^12, which is a representative example of latency-sensitive applications.
Fig. 2.9 illustrates the experimental setup. The laptop $P_1$ is located in the area covered
by the wireless access point $A_1$, and two other laptops, $P_2$ and $P_3$, are located in a
different area covered by the wireless access point $A_2$, which is configured to serve only
$P_2$ and $P_3$. Suppose it is known that $P_1$ is engaged in a VoIP conversation. By
measuring the wireless transmission epochs of $P_1$ and $A_2$, our objective is to detect
whether $P_1$ is having a VoIP conversation with any device served by the access point $A_2$.
In practice, there may be additional information available: packet sizes, protocol types (TCP
or UDP), destination addresses, and so on. However, here we assume that we have no access to
such information due to encryption or other countermeasures employed by the network
administrator, and only the timing measurements are available.

Let $S_1$ and $S_2$ denote the transmission processes of $P_1$ and $A_2$ respectively. Under
$\mathcal{H}_1$, $P_1$ has a VoIP conversation with $P_2$, and $P_3$ downloads a file from a
distant FTP server at a rate of 20kB/s. Since $A_2$ transmits packets for both $P_2$ and
$P_3$, its transmission timings of FTP packets, destined for $P_3$, form the chaff part of
$S_2$. Under $\mathcal{H}_0$, $P_1$ and $P_2$ engage in independent VoIP conversations while
$P_3$ does the same job as in $\mathcal{H}_1$. Hence, VoIP packet timings in $S_1$ and those
in $S_2$ are independent under $\mathcal{H}_0$. Under both hypotheses, the timings of network
control/management packets from $P_1$ and $A_2$ (except beacon frames of $A_2$) are also
included in $S_1$ and $S_2$.

Figure 2.9: If $P_1$ has a VoIP conversation with either $P_2$ or $P_3$, the VoIP packets
should depart from $P_1$ and travel through $A_2$.

^12 Windows Live Messenger 2009 (14.0.8089.726) was used for MSN VoIP calls, and Wireshark
network protocol analyzer (ver 1.2.6.) with the AirPcap classic adaptor was used to record
the timings of wireless transmissions.

We used $\Delta = 150$ms in the detection algorithms, because 150ms is the upper bound of
acceptable end-to-end delays of VoIP packets recommended by ITU-T recommendation G.114 [67].
We first tested DAC with various $p_\Delta$ ranging from 10 to 400. Similar to the result on
LBL TCP traces, the miss detection probability was higher than 0.55 in every case due to the
bursty chaff transmissions (i.e., bursty FTP transmissions from $A_2$ to $P_3$). Table 2.4
shows the error probabilities of NBFD, the wavelet analysis, and RP. As in the test using LBL
traces, we used $\epsilon = 0.05$ for NBFD; for the wavelet analysis and RP, the crossover
error rates and the corresponding thresholds are listed in the table. NBFD and the wavelet
analysis outperformed RP, and they displayed vanishing error probabilities as the observation
duration increases.

Table 2.4: Performance on MSN VoIP data. NBFD parameters are $w = 2$s,
$\alpha = \Delta = 0.15$s, and $\epsilon = 0.05$. The numbers of experiments are 162, 81, and
40 for observation durations 80s, 160s, and 320s, respectively. Under $\mathcal{H}_0$ and
$\mathcal{H}_1$, the average rate is $(\lambda_1, \lambda_2) = (26.8, 34.9)$. The fraction of
chaff in $\mathcal{H}_1$ traffic is 0.18.

          NBFD              Wavelet                    RP
Time      P_F     P_M       κ       P_F     P_M        t_h     P_F     P_M
80s       0.086   0.056     0.14    0.093   0.093      949     0.086   0.099
160s      0       0.049     0.17    0.012   0.012      989     0.049   0.074
320s      0       0         0.23    0       0          1005    0.075   0.050

In all the tests we executed, NBFD and the wavelet analysis consistently outperformed DAC and
RP. Even though the wavelet analysis performed well over most traces, we need to recall that
the results in Table 2.3 and Table 2.4 were possible because its threshold $\kappa$ was set
a posteriori to minimize its error probabilities. If neither a training data set nor a
parametric model is available, we have no clear standard to set $\kappa$. For further
comparison of NBFD and the wavelet analysis, Fig. 2.10 shows $P_F$ and $P_M$ of NBFD and the
wavelet analysis with various thresholds. We can observe that the optimal $\kappa$ of the
wavelet analysis varies significantly for different observation durations and different test
traces. For instance, in the test result for synthetic Poisson traffic, $\kappa \approx 0.25$
gave the best performance when the observation duration is 80s, but it resulted in
$P_M \approx 0.85$ for the 20s case. In addition, for the fixed observation duration of 80s,
the optimal $\kappa$ for the Poisson traffic ($\approx 0.25$) and that for the VoIP traffic
($\approx 0.15$) are quite different. In contrast, for NBFD, it can be observed that
$\epsilon = 0.05$ results in almost optimal performance in every case. In particular, in
every test, its false alarm probability vanished as the observation duration increased. This
suggests that under $\mathcal{H}_0$, the difference between $R(t)$ and $\tilde{R}(t)$ of NBFD
is well bounded by $\epsilon = 0.05$.

Figure 2.10: False alarm and miss detection probabilities of the wavelet analysis and NBFD
with various thresholds. Panels: (a) Synthetic Poisson traffic, (b) LBL TCP traces,
(c) MSN VoIP experiment.

2.6  Proofs 


2.6.1  Proof  of  Theorem  2.3.1 


We use the following lemma about the relation between BiBGM with $\Delta$ and
Bounded-Greedy-Match (BGM) [13] with $2\Delta$ (for the details of BGM, refer to Section 4.A
of [14]).

Lemma 2.6.0.1 Running BiBGM on $(s_i)_{i=1}^2$ with $\Delta$ is equivalent to the following:

1. Increase all the epochs of $s_2$ by $\Delta$.

2. Apply BGM with the delay constraint $2\Delta$ to the modified measurements.

Proof of Lemma 2.6.0.1: Let $\hat{s}_2$ be a sequence generated by increasing every epoch in
$s_2$ by $\Delta$ (i.e., $\hat{s}_2(i) = s_2(i) + \Delta$, $1 \le i \le |s_2|$). Then,
replacing $s_2(n)$ with $\hat{s}_2(n) - \Delta$ in Table 2.1 results in exactly the same
pseudocode as BGM with $2\Delta$ on $(s_1, \hat{s}_2)$ (see Table 3 in [14] for the
pseudocode). □

Note that $(a, b) \in s_1 \times s_2$ and $|a - b| \le \Delta$ if and only if
$(a, b + \Delta) \in s_1 \times \hat{s}_2$ and $b + \Delta \in [a, a + 2\Delta]$. Hence, the
optimal partitioning of $(s_i)_{i=1}^2$ is equivalent to partitioning $(s_1, \hat{s}_2)$ into
the unidirectional flow part (with the delay constraint $2\Delta$) and the chaff part such
that the flow part is maximized; BGM with $2\Delta$ was proved in [13] to achieve the optimal
partitioning of $(s_1, \hat{s}_2)$. Thus, Lemma 2.6.0.1 implies the result. ■
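
The equivalence in Lemma 2.6.0.1 can be illustrated with a short Python sketch. The matcher `bgm` below is a simple two-pointer greedy matcher written in the spirit of BGM; it is an assumption of this sketch and does not reproduce the pseudocode of Table 2.1 or of [14]. The function names are hypothetical, and the functions return the number of matched pairs (each pair contributes two matched epochs).

```python
def bgm(s1, s2, delta):
    """Greedy matching for a unidirectional flow s1 -> s2.

    An epoch a of s1 may be matched with an epoch b of s2 only if
    0 <= b - a <= delta. Both inputs are sorted lists of epochs.
    Returns the number of matched pairs.
    """
    m = n = matched = 0
    while m < len(s1) and n < len(s2):
        d = s2[n] - s1[m]
        if d < 0:
            n += 1            # s2[n] precedes s1[m]: treat it as chaff
        elif d > delta:
            m += 1            # no relay of s1[m] within delta: chaff
        else:
            matched += 1      # match the pair and move on
            m += 1
            n += 1
    return matched


def bibgm_via_shift(s1, s2, delta):
    """Bidirectional matching (|a - b| <= delta) through the lemma:
    shift s2 by delta, then run the unidirectional matcher with 2*delta."""
    shifted = [b + delta for b in s2]
    return bgm(s1, shifted, 2 * delta)
```

For example, with `s1 = [0.0, 1.0, 2.0]`, `s2 = [0.125, 0.875, 3.0]`, and `delta = 0.25`, the unidirectional matcher finds only the pair (0.0, 0.125), while the shifted version also finds (1.0, 0.875), in which the epoch of `s2` precedes that of `s1`.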


2.6.2  Proof  of  Theorem  2.3.2 


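Let $\hat{S}_2$ denote the point process with $\hat{S}_2(i) = S_2(i) + \Delta$, $i \ge 1$.
Theorem 4.2 in [14] showed that if we run BGM with $2\Delta$ on $(S_1, \hat{S}_2)$, the
fraction of the matched epochs in total epochs converges a.s. to the following:

\[
\begin{cases}
\dfrac{2\lambda_1\lambda_2\left(1 - e^{2\Delta(\lambda_1 - \lambda_2)}\right)}
      {(\lambda_1 + \lambda_2)\left(\lambda_2 - \lambda_1 e^{2\Delta(\lambda_1 - \lambda_2)}\right)}
  & \text{if } \lambda_1 \neq \lambda_2, \\[2ex]
\dfrac{2\lambda\Delta}{1 + 2\lambda\Delta}
  & \text{if } \lambda_1 = \lambda_2 = \lambda.
\end{cases}
\]

Therefore, Lemma 2.6.0.1 implies the result.


2.6.3 Proof of Theorem 2.3.3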


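We first introduce the following lemma about the statistical behavior of $R(t)$ under
$\mathcal{H}_1$.


Lemma 2.6.0.2 Suppose that the distributions of $(S_i)_{i=1}^2$ under $\mathcal{H}_1$
satisfy the conditions that (i) $S_1$ and $S_2$ have rates $\lambda_1$ and $\lambda_2$
respectively, (ii) $(F_1, F_2)$ is a bidirectional flow with rate^13 $\lambda_f$, and
(iii) $W_1$ and $W_2$ are homogeneous Poisson processes. Then, under every distribution in
$\mathcal{H}_1$, $\liminf_{t \to \infty} R(t) \ge \theta(\lambda_1, \lambda_2, \lambda_f)$
a.s., where $\theta(\lambda_1, \lambda_2, \lambda_f)$ is defined as

\[
\theta(\lambda_1, \lambda_2, \lambda_f) =
\begin{cases}
\dfrac{2\lambda_1 - 2\lambda_2\left(\frac{\lambda_1 - \lambda_f}{\lambda_2 - \lambda_f}\right) e^{2\Delta(\lambda_1 - \lambda_2)}}
      {(\lambda_1 + \lambda_2)\left(1 - \left(\frac{\lambda_1 - \lambda_f}{\lambda_2 - \lambda_f}\right) e^{2\Delta(\lambda_1 - \lambda_2)}\right)}
  & \text{if } \lambda_1 \neq \lambda_2, \\[2ex]
\dfrac{\lambda_f + 2\Delta(\lambda - \lambda_f)\lambda}{\lambda\left(1 + 2(\lambda - \lambda_f)\Delta\right)}
  & \text{if } \lambda_1 = \lambda_2 = \lambda.
\end{cases}
\]

^13 If $N_1(t)$, $N_2(t)$, and $N_F(t)$ denote the number of epochs of $S_1$, $S_2$, and
$F_1$ in $[0, t]$, respectively, then $\lim_{t \to \infty} N_i(t)/t = \lambda_i$ a.s. for
$i = 1, 2$, and $\lim_{t \to \infty} N_F(t)/t = \lambda_f$ a.s.

Proof of Lemma 2.6.0.2: Let $N(t)$, $N_f(t)$, and $N_c(t)$ denote the number of epochs of
$(S_i)_{i=1}^2$, $(F_i)_{i=1}^2$, and $(W_i)_{i=1}^2$ in $[0, t]$, respectively. $M(t)$
denotes the number of the matched epochs found by running BiBGM over $(S_i)_{i=1}^2$ in
$[0, t]$.

Consider running BiBGM on $(F_i)_{i=1}^2$ and $(W_i)_{i=1}^2$ separately in $[0, t]$:
$\bar{M}(t)$ denotes the sum of the number of the matched epochs in $(F_i)_{i=1}^2$ and that
in $(W_i)_{i=1}^2$, and $R_W(t)$ denotes the fraction of the matched epochs in
$(W_i)_{i=1}^2$. Theorem 2.3.1 implies that running BiBGM on $(S_i)_{i=1}^2$ results in a
greater or an equal number of matched epochs than running BiBGM on $(F_i)_{i=1}^2$ and
$(W_i)_{i=1}^2$ separately. Therefore,

\[
M(t) \ge \bar{M}(t) = N_f(t) + N_c(t) R_W(t),
\]
\[
\frac{M(t)}{N(t)} \ge \frac{N_f(t)}{N(t)} + \frac{N_c(t)}{N(t)} R_W(t)
 = \frac{N_f(t)/t}{N(t)/t} + \frac{N_c(t)/t}{N(t)/t} R_W(t).
\]

We have $\frac{M(t)}{N(t)} = R(t)$,
$\lim_{t \to \infty} \frac{N_f(t)/t}{N(t)/t} = \frac{2\lambda_f}{\lambda_1 + \lambda_2}$ a.s.,
$\lim_{t \to \infty} \frac{N_c(t)/t}{N(t)/t} = \frac{\lambda_1 + \lambda_2 - 2\lambda_f}{\lambda_1 + \lambda_2}$
a.s., and $\lim_{t \to \infty} R_W(t) = \phi(\lambda_1 - \lambda_f, \lambda_2 - \lambda_f)$
a.s., where $\phi$ is defined as in Theorem 2.3.2. Thus,

\[
\liminf_{t \to \infty} R(t) \ge \frac{2\lambda_f}{\lambda_1 + \lambda_2}
 + \frac{\lambda_1 + \lambda_2 - 2\lambda_f}{\lambda_1 + \lambda_2}\,
   \phi(\lambda_1 - \lambda_f, \lambda_2 - \lambda_f) \quad \text{a.s.}
\]

It can be shown that the right-hand side is $\theta(\lambda_1, \lambda_2, \lambda_f)$. □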


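Let $\eta$ be any fixed number in $\left(0, \frac{2\min\{\lambda_1, \lambda_2\}}{\lambda_1 + \lambda_2}\right)$ and $\tau$ be the suggested threshold for $\eta$. Then, there exists a positive $\bar{\lambda}_f$ such that $\frac{2\bar{\lambda}_f}{\lambda_1 + \lambda_2} = \eta$. Let $h(x) \triangleq \theta(\lambda_1, \lambda_2, x)$. It can be checked that $h(x)$ is strictly increasing in $[0, \min\{\lambda_1, \lambda_2\}]$, and $h(\bar{\lambda}_f)$ is equal to $\tau$.

(i) Miss detection probability: Suppose $\mathcal{H}_1$ is true and $\mathcal{R} > \eta$ a.s. Then, $\mathcal{R} = \frac{2\lambda_f}{\lambda_1 + \lambda_2}$ and $\lambda_f = \frac{(\lambda_1 + \lambda_2)\mathcal{R}}{2} > \bar{\lambda}_f$, because $\mathcal{R} > \eta$. Lemma 2.6.0.2 and the monotonicity of $h$ give

$$\liminf_{t\to\infty} R(t) \ge \theta(\lambda_1, \lambda_2, \lambda_f) = h(\lambda_f) > h(\bar{\lambda}_f) = \tau \quad \text{a.s.}$$

Hence, $\lim_{t\to\infty} \Pr(R(t) < \tau) = 0$.

(ii) False alarm probability: Note that $h(0) = \phi(\lambda_1, \lambda_2)$. Under $\mathcal{H}_0$,

$$\lim_{t\to\infty} R(t) = \phi(\lambda_1, \lambda_2) = h(0) < h(\bar{\lambda}_f) = \tau \quad \text{a.s.},$$

and thus $\lim_{t\to\infty} \Pr(R(t) \ge \tau) = 0$. Furthermore, Lemma 2.6.0.1 and Theorem 6.4 in [14] imply the exponential decay of the false alarm probability. ■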


2.6.4  Proof  of  Theorem  2.4.1 


We first introduce a useful lemma.

Figure 2.11: In this example, $M = 2$, $a_{1;1} = 1$, $a_{1;2} = 3$, $a_{1;3} = 5$, $a_{2;1} = 2$, and $a_{2;2} = 4$. We ran BiBGM on $(S_i)_{i=1}^2$ and marked the matches by the arrows. Some matches consist of epochs in two different partitions, and they are marked by the dashed arrows. The matches consisting of epochs in a single partition are marked by the solid arrows. One can observe that each solid arrow in $(S_i)_{i=1}^2$ can be found either in $(S_i^{(1)})_{i=1}^2$ or $(S_i^{(2)})_{i=1}^2$.

Lemma 2.6.0.3 Suppose that $S_1$ and $S_2$ are non-homogeneous Poisson processes, and their local intensities always stay in $[\lambda_{\min}, \lambda_{\max}]$, where $\lambda_{\min} > 0$. As illustrated in Fig. 2.11, we partition $[0, \infty)$ into a countable number of subintervals: $I_i$ denotes the $i$th subinterval, $T_i$ is the length of $I_i$, and $d(t)$ denotes the number of $I_i$s with $I_i \subset [0, t]$. Suppose $\frac{d(t)}{t}$ decreases to 0 as $t$ grows.

Let $M$ be a finite natural number and suppose we partition the set $\{I_i, i \ge 1\}$ into $M$ subsets $\{I_{a_{k;i}}, i \ge 1\}$, $1 \le k \le M$, where $(a_{k;i})_{i \ge 1}$, $1 \le k \le M$, are subsequences of $(1, 2, 3, \ldots)$. For $1 \le k \le M$, we use the epochs of $(S_i)_{i=1}^2$ in $(I_{a_{k;i}})_{i \ge 1}$ to generate point processes $(S_i^{(k)})_{i=1}^2$, as described in Fig. 2.11:

1. Initially, $S_1^{(k)}$ and $S_2^{(k)}$ have no epoch.

2. For $n \ge 1$, for $i = 1, 2$, subtract $\sum_{j=1}^{a_{k;n}-1} T_j$ from all the epochs of $S_i$ in the interval $I_{a_{k;n}}$, add $\sum_{j=1}^{n-1} T_{a_{k;j}}$ to them, and add these epochs to $S_i^{(k)}$.

Let $N(t)$ denote the number of epochs of $(S_i)_{i=1}^2$ in $[0, t]$ and $N^{(k)}(t)$ denote the number of epochs of $(S_i^{(k)})_{i=1}^2$ whose original epoch in $(S_i)_{i=1}^2$ is in $[0, t]$; by definition, $N(t) = \sum_{k=1}^{M} N^{(k)}(t)$. We run BiBGM on $(S_i)_{i=1}^2$ and let $R(t)$ denote the fraction of the matched epochs in the total epochs in $[0, t]$. In addition, we run BiBGM on $(S_i^{(k)})_{i=1}^2$ separately for each $k$, and $N_f^{(k)}(t)$ denotes the number of the matched epochs among the earliest $N^{(k)}(t)$ epochs of $(S_i^{(k)})_{i=1}^2$. And, we define $\tilde{R}(t)$ as $\frac{\sum_{k=1}^{M} N_f^{(k)}(t)}{N(t)}$.

Then, $\lim_{t\to\infty} \left(R(t) - \tilde{R}(t)\right) = 0$ almost surely.
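
The two-step construction of the sub-processes in the lemma can be sketched in code. The following is a minimal illustration for a single process, not the dissertation's implementation; the function name `split_into_subprocesses` and its argument layout are assumptions made for this example. Each subinterval is cut out of the original timeline and appended to the end of the timeline of its subset.

```python
from itertools import accumulate


def split_into_subprocesses(epochs, lengths, labels):
    """Rearrange the epochs of one process S_i into sub-processes S_i^(k).

    epochs  : epoch times of S_i (any order)
    lengths : [T_1, T_2, ...], lengths of the consecutive subintervals I_1, I_2, ...
    labels  : labels[n] = k means subinterval I_{n+1} belongs to subset k
    Returns a dict mapping k to the list of shifted epochs of S_i^(k).
    (Hypothetical helper; names are illustrative only.)
    """
    starts = [0.0] + list(accumulate(lengths))  # starts[n] = left end of I_{n+1}
    placed = {}  # total length of subset-k intervals already appended
    out = {}
    for n, (length, k) in enumerate(zip(lengths, labels)):
        lo, hi = starts[n], starts[n + 1]
        # Subtract the start of the interval, add the length already placed for k.
        shift = placed.get(k, 0.0) - lo
        out.setdefault(k, []).extend(e + shift for e in epochs if lo <= e < hi)
        placed[k] = placed.get(k, 0.0) + length
    return out


# Same index pattern as Fig. 2.11: subset 1 = {I_1, I_3, I_5}, subset 2 = {I_2, I_4}.
example = split_into_subprocesses(
    epochs=[0.5, 2.5, 3.25, 6.5, 8.25],
    lengths=[2.0, 1.0, 3.0, 2.0, 1.0],
    labels=[1, 2, 1, 2, 1],
)
```

Applying the same function to both $S_1$ and $S_2$ with identical `lengths` and `labels` preserves the relative timing of any match whose two epochs fall in the same subinterval, which is the property the proof relies on.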

Proof of Lemma 2.6.0.3: Let $N_f(t)$ denote the number of BiBGM-matched epochs of $(S_i)_{i=1}^2$ in $[0, t]$. Then, by definition, $R(t) = \frac{N_f(t)}{N(t)}$. Let $d_i$ denote the time that the $i$th division occurs; in other words, $d_i$ is the time that the $i$th jump of $d(t)$ occurs. Formally, we say that a BiBGM match $(t_1, t_2)$, where $t_i$ is an epoch of $S_i$, is broken if $t_1 \in I_a$, $t_2 \in I_b$, and $a \neq b$. Let $\tilde{N}_f(t)$ denote the number of epochs of the unbroken BiBGM matches in $[0, t]$. As described in Fig. 2.11, if an unbroken BiBGM match $(t_1, t_2)$ in $[0, t]$ is such that $t_1$ and $t_2$ are included in a single partition $I_{a_{k;i}}$, then its shifted version can be found in the $[0, t p_k(t)]$ interval of $(S_i^{(k)})_{i=1}^2$, where $p_k(t)$ is the fraction of $\left(\bigcup_{i \ge 1} I_{a_{k;i}}\right) \cap [0, t]$ in $[0, t]$. In addition, Theorem 2.3.1 implies that $N_f^{(k)}(t)$ is no less than the number of epochs belonging to the shifted unbroken matches in $[0, t p_k(t)]$ of $(S_i^{(k)})_{i=1}^2$ (i.e., solid arrows in Fig. 2.11). Therefore, $\sum_{k=1}^{M} N_f^{(k)}(t) \ge \tilde{N}_f(t)$.

For $j = 1, 2$, let $X_j(i)$ denote the number of epochs of $S_j$ in $\left[\max\left\{\frac{d_{i-1} + d_i}{2}, d_i - \Delta\right\}, \min\left\{\frac{d_i + d_{i+1}}{2}, d_i + \Delta\right\}\right)$, where $d_0 \triangleq -d_1$. The number of epochs of the broken matches in $[0, t]$ is bounded above by $\sum_{i=1}^{d(t)} X_1(i) + \sum_{i=1}^{d(t)} X_2(i)$. Hence,

$$\tilde{N}_f(t) \ge N_f(t) - \sum_{i=1}^{d(t)} X_1(i) - \sum_{i=1}^{d(t)} X_2(i).$$

There exist sequences of i.i.d. Poisson random variables $(\bar{X}_1(i))_{i \ge 1}$ and $(\bar{X}_2(i))_{i \ge 1}$ with mean $2\lambda_{\max}\Delta$ such that $X_j(i) \le \bar{X}_j(i)$ a.s. for $i \ge 1$, $j = 1, 2$. Hence,

$$\sum_{k=1}^{M} N_f^{(k)}(t) \ge \tilde{N}_f(t) \ge N_f(t) - \sum_{i=1}^{d(t)} \bar{X}_1(i) - \sum_{i=1}^{d(t)} \bar{X}_2(i),$$

$$R(t) - \tilde{R}(t) \le \frac{\sum_{i=1}^{d(t)} \bar{X}_1(i)}{N(t)} + \frac{\sum_{i=1}^{d(t)} \bar{X}_2(i)}{N(t)}.$$

For $j = 1, 2$, we have

$$\limsup_{t\to\infty} \frac{d(t)/t}{N(t)/t} \cdot \frac{\sum_{i=1}^{d(t)} \bar{X}_j(i)}{d(t)} = 0 \quad \text{a.s.}$$

Hence,

$$\limsup_{t\to\infty} \left(R(t) - \tilde{R}(t)\right) \le 0 \quad \text{a.s.}$$

Similarly, we can partition $(S_i^{(k)})_{i=1}^2$ at time points $(d_{k;i})_{i \ge 1}$, where $d_{k;i} \triangleq \sum_{j=1}^{i} T_{a_{k;j}}$, and use the number of unbroken BiBGM matches of $(S_i^{(k)})_{i=1}^2$ in $[0, t p_k(t)]$, $1 \le k \le M$, to obtain a lower bound on the number of BiBGM matches of $(S_i)_{i=1}^2$ in $[0, t]$. Then, based on a similar argument, we can derive $\liminf_{t\to\infty} (R(t) - \tilde{R}(t)) \ge 0$ a.s. Hence, we have $\lim_{t\to\infty} (R(t) - \tilde{R}(t)) = 0$ a.s., and the proof is complete. □


The proof of Theorem 2.4.1 consists of two parts: one for proving the vanishing false alarm probability under $\mathcal{H}_0$, and the other for proving the vanishing miss detection probability under $\mathcal{H}_1$.


False  Alarm  Probability 

Suppose that $\mathcal{H}_0$ is true and the distribution of $(S_i)_{i=1}^2$ satisfies the assumptions of the theorem. $S_1$ and $S_2$ are independent non-homogeneous Poisson processes, and so are the outputs of ITA, $\tilde{S}_1$ and $\tilde{S}_2$. Suppose we run BiBGM on $(S_i)_{i=1}^2$ and let $R(t)$ denote the fraction of the matched epochs in the total epochs in $[0, t]$. We also run BiBGM on $(\tilde{S}_i)_{i=1}^2$ and let $T(t)$ denote the fraction of the matched epochs in the total epochs in $\left[0, \left\lfloor \frac{t}{2(w+a)} \right\rfloor w\right]$. In the following, we will show that $R(t) - T(t)$ converges a.s. to 0.

Because the number of intensity changes in $[0, t]$ grows sublinearly in $t$ (its ratio to $t$ tends to 0), there are at most a countable number of intensity changes. Let $(c_i)_{i \ge 1}$ denote the increasing sequence of the time points at which $(\lambda_1(t), \lambda_2(t))$ changes. We partition $[0, \infty)$ into a countable number of subintervals $\{I_i = [c_{i-1}, c_i), i \ge 1\}$. For $1 \le k \le M_0$, $(a_{k;i})_{i \ge 1}$ denotes the increasing sequence of all the indices of $I_i$s in which $(\lambda_1(t), \lambda_2(t)) = (\mu_1^{(k)}, \mu_2^{(k)})$. For each $k$, we use the epochs of $(S_i)_{i=1}^2$ in $(I_{a_{k;i}})_{i \ge 1}$ to generate a pair of point processes $(S_i^{(k)})_{i=1}^2$, as described in Lemma 2.6.0.3.


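Let $N(t)$ denote the number of epochs of $(S_i)_{i=1}^2$ in $[0, t]$. Suppose we run BiBGM on $(S_i^{(k)})_{i=1}^2$ separately for $1 \le k \le M_0$. $N^{(k)}(t)$ denotes the number of epochs of $(S_i^{(k)})_{i=1}^2$ in $[0, t p_k(t)]$, and $N_f^{(k)}(t)$ denotes the number of BiBGM-matched epochs among those $N^{(k)}(t)$ epochs. Then, Lemma 2.6.0.3 implies

$$\lim_{t\to\infty} \left( R(t) - \frac{\sum_{k=1}^{M_0} N_f^{(k)}(t)}{N(t)} \right) = 0 \quad \text{a.s.}$$

And,

$$\frac{\sum_{k=1}^{M_0} N_f^{(k)}(t)}{N(t)} = \frac{t}{N(t)} \sum_{k=1}^{M_0} p_k(t) \cdot \frac{N^{(k)}(t)}{t p_k(t)} \cdot \frac{N_f^{(k)}(t)}{N^{(k)}(t)}.$$

By analyzing the limiting behaviors of $\frac{t}{N(t)}$, $p_k(t)$, $\frac{N^{(k)}(t)}{t p_k(t)}$, and $\frac{N_f^{(k)}(t)}{N^{(k)}(t)}$ (use Theorem 2.3.2), we have

$$\lim_{t\to\infty} \frac{\sum_{k=1}^{M_0} N_f^{(k)}(t)}{N(t)} = \frac{\sum_{k=1}^{M_0} p_k (\mu_1^{(k)} + \mu_2^{(k)})\, \phi(\mu_1^{(k)}, \mu_2^{(k)})}{\sum_{k=1}^{M_0} p_k (\mu_1^{(k)} + \mu_2^{(k)})} \quad \text{a.s.},$$

where $p_k \triangleq \lim_{t\to\infty} p_k(t)$ and $\phi$ is defined in Theorem 2.3.2. Then, by Lemma 2.6.0.3,

$$\lim_{t\to\infty} R(t) = \frac{\sum_{k=1}^{M_0} p_k (\mu_1^{(k)} + \mu_2^{(k)})\, \phi(\mu_1^{(k)}, \mu_2^{(k)})}{\sum_{k=1}^{M_0} p_k (\mu_1^{(k)} + \mu_2^{(k)})} \quad \text{a.s.} \tag{2.3}$$

Figure 2.12: This figure illustrates a simple case in which $(\lambda_1(t), \lambda_2(t))$ can only take either $(\mu_1^{(1)}, \mu_2^{(1)})$ or $(\mu_1^{(2)}, \mu_2^{(2)})$. The bars filled with slant lines represent the intervals in which $\lambda_i(t) = \mu_i^{(1)}$, and the blue bars represent the intervals in which $\lambda_i(t) = \mu_i^{(2)}$. In this example, $J_2$ and $J_4$ are in $\mathcal{C}$.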


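Now, we will prove that $T(t)$ also converges almost surely to the same constant. Let $\tilde{c}_i \triangleq \frac{w}{2(w+a)} c_i$. As depicted in Fig. 2.12, the local intensities of $\tilde{S}_1$ and $\tilde{S}_2$, denoted by $(\tilde{\lambda}_1(t), \tilde{\lambda}_2(t))$, may be equal to $(\mu_1^{(j)}, \mu_2^{(k)})$ with $j \neq k$, and it happens only if any $\tilde{c}_i$ is in $\left[w \lfloor \frac{t}{w} \rfloor, w(\lfloor \frac{t}{w} \rfloor + 1)\right)$. Define $\mathcal{C}$ as the set

$$\{[w(k-1), wk) : k \in \mathbb{N}, \ \exists\, i \text{ s.t. } \tilde{c}_i \in [w(k-1), wk)\}.$$

As illustrated in Fig. 2.12, we partition $[0, \infty)$ of $(\tilde{S}_i)_{i=1}^2$ into the intervals in $\mathcal{C}$ and the gap intervals between two adjacent intervals in $\mathcal{C}$, and $(J_i)_{i \ge 1}$ denotes the sequence of these intervals arranged in time order. $\{a_{0;i}, i \ge 1\}$ denotes the increasing sequence of the indices of $J_i$s satisfying $J_i \in \mathcal{C}$. For $1 \le k \le M_0$, $\{a_{k;i}, i \ge 1\}$ denotes the increasing sequence of the indices of $J_i$s satisfying $(\tilde{\lambda}_1(t), \tilde{\lambda}_2(t)) = (\mu_1^{(k)}, \mu_2^{(k)})$, $\forall t \in J_i$. Then, $\{J_i, i \ge 1\}$ can be partitioned into the $(M_0 + 1)$ sets, $\{J_{a_{k;i}}, i \ge 1\}$, $0 \le k \le M_0$. For $0 \le k \le M_0$, we use the epochs of $(\tilde{S}_i)_{i=1}^2$ in $(J_{a_{k;i}})_{i \ge 1}$ to generate $(\tilde{S}_i^{(k)})_{i=1}^2$, in the same manner as we generate $(S_i^{(k)})_{i=1}^2$ based on $(I_{a_{k;i}})_{i \ge 1}$ in Lemma 2.6.0.3. Then, based on Lemma 2.6.0.3 and $(\tilde{S}_i^{(k)})_{i=1}^2$ ($0 \le k \le M_0$), we can use an argument similar to the one used in obtaining $\lim_{t\to\infty} R(t)$ and show

$$\lim_{t\to\infty} T(t) = \frac{\sum_{k=1}^{M_0} p_k (\mu_1^{(k)} + \mu_2^{(k)})\, \phi(\mu_1^{(k)}, \mu_2^{(k)})}{\sum_{k=1}^{M_0} p_k (\mu_1^{(k)} + \mu_2^{(k)})} \quad \text{a.s.} \tag{2.4}$$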


From (2.3) and (2.4), we can see that $R(t) - T(t)$ converges almost surely to 0 as $t$ grows. Hence, for any positive $\epsilon$, the false alarm probability vanishes as $t$ grows:

$$\lim_{t\to\infty} P_F(t) = \lim_{t\to\infty} \Pr(R(t) - T(t) \ge \epsilon) = 0.$$


Miss  Detection  Probability 


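Suppose that $\mathcal{H}_1$ is true and the distribution of $(S_i)_{i=1}^2$ satisfies the assumptions of the theorem including $\mathcal{R} \ge \eta$ a.s. Due to the almost sure convergence of $\mathcal{R}(t)$, '$\mathcal{R} = \liminf_{t\to\infty} \mathcal{R}(t) \ge \eta$ a.s.' is equivalent to

$$\frac{\sum_{k=1}^{M_1} 2 p_k \lambda_f^{(k)}}{\sum_{k=1}^{M_1} p_k (\lambda_1^{(k)} + \lambda_2^{(k)})} \ge \eta, \tag{2.5}$$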


where $p_k \triangleq \lim_{t\to\infty} p_k(t)$. In addition, the first assumption of the theorem guarantees that $\tilde{S}_1$ and $\tilde{S}_2$ are independent non-homogeneous Poisson processes. We run BiBGM on $(S_i)_{i=1}^2$ and let $R(t)$ denote the fraction of the matched epochs in the total epochs in $[0, t]$. We also run BiBGM on $(\tilde{S}_i)_{i=1}^2$ and let $T(t)$ denote the fraction of the matched epochs in the total epochs in $\left[0, \left\lfloor \frac{t}{2(w+a)} \right\rfloor w\right]$. First of all, by following exactly the same steps as in the proof of vanishing false alarm probability, we can derive

$$\lim_{t\to\infty} T(t) = \frac{\sum_{k=1}^{M_1} p_k (\lambda_1^{(k)} + \lambda_2^{(k)})\, \phi(\lambda_1^{(k)}, \lambda_2^{(k)})}{\sum_{k=1}^{M_1} p_k (\lambda_1^{(k)} + \lambda_2^{(k)})} \quad \text{a.s.} \tag{2.6}$$


Let $(c_i)_{i \ge 1}$ denote the increasing sequence of the time points at which $\lambda(t)$ changes, and we partition $[0, \infty)$ into a countable number of subintervals $\{I_i = [c_{i-1}, c_i), i \ge 1\}$. For $1 \le k \le M_1$, let $(a_{k;i})_{i \ge 1}$ denote the increasing sequence of all the indices of $I_i$s satisfying $\lambda(t) = \lambda^{(k)}$, $\forall t \in I_i$. We use the epochs of $(S_i)_{i=1}^2$ in $(I_{a_{k;i}})_{i \ge 1}$ to generate a pair of point processes $(S_i^{(k)})_{i=1}^2$, as in Lemma 2.6.0.3. Then, based on Lemma 2.6.0.2, Lemma 2.6.0.3, and $(S_i^{(k)})_{i=1}^2$ ($1 \le k \le M_1$), we can use an argument similar to the one used in obtaining $\lim_{t\to\infty} R(t)$ in the proof of vanishing false alarm probability and derive

$$\liminf_{t\to\infty} R(t) \ge \frac{\sum_{k=1}^{M_1} p_k (\lambda_1^{(k)} + \lambda_2^{(k)})\, \theta(\lambda_1^{(k)}, \lambda_2^{(k)}, \lambda_f^{(k)})}{\sum_{k=1}^{M_1} p_k (\lambda_1^{(k)} + \lambda_2^{(k)})} \quad \text{a.s.},$$


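where $\theta$ is defined in Lemma 2.6.0.2. For fixed $\lambda_1$ and $\lambda_2$, $\theta(\lambda_1, \lambda_2, \lambda_f)$ is a strictly increasing function of $\lambda_f$, and it decreases to $\phi(\lambda_1, \lambda_2)$ as $\lambda_f$ decays to 0. Hence, if we define $\gamma$ as

$$\gamma \triangleq \min_{(p_k)_{k=1}^{M_1}} \left[ \frac{\sum_{k=1}^{M_1} p_k (\lambda_1^{(k)} + \lambda_2^{(k)})\, \theta(\lambda_1^{(k)}, \lambda_2^{(k)}, \lambda_f^{(k)})}{\sum_{k=1}^{M_1} p_k (\lambda_1^{(k)} + \lambda_2^{(k)})} - \frac{\sum_{k=1}^{M_1} p_k (\lambda_1^{(k)} + \lambda_2^{(k)})\, \phi(\lambda_1^{(k)}, \lambda_2^{(k)})}{\sum_{k=1}^{M_1} p_k (\lambda_1^{(k)} + \lambda_2^{(k)})} \right],$$

where the minimization is over $\{(p_k)_{k=1}^{M_1} : \text{(2.5) holds}\}$, then it can be easily seen that $\gamma$ is strictly greater than 0. Set $\bar{\epsilon} = \frac{\gamma}{2}$, and let $\epsilon$ be an arbitrary number in $(0, \bar{\epsilon}]$. Then, if the condition (2.5) holds,

$$\liminf_{t\to\infty} (R(t) - T(t)) \ge \gamma \ge 2\epsilon \quad \text{a.s.}$$

Therefore, as long as the condition (2.5) holds, the miss detection probability vanishes as $t$ grows:

$$\lim_{t\to\infty} P_M(t) = \lim_{t\to\infty} \Pr(R(t) - T(t) < \epsilon) = 0.$$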




CHAPTER  3 

TOPOLOGY  ATTACK  OF  A  POWER  GRID 

3.1  Introduction 

A defining feature of a smart grid is its ability to monitor the state of a large power grid, to adapt to changing operating conditions, and to react intelligently to contingencies, all of which depend critically on a reliable and secure cyber-infrastructure. It has been widely recognized that the heavy reliance on a wide area communications network for grid monitoring and real-time operation comes with increasing security risks of cyber-attacks. See [72] for a vulnerability analysis of energy delivery control systems.

While  information  security  has  been  a  major  focus  of  research  for  over  half  a 
century,  the  mechanisms  and  the  impacts  of  attack  on  cyber  physical  systems  such 
as  the  power  grid  are  not  yet  well  understood,  and  effective  countermeasures  are 
still  lacking. 

We consider a form of “man-in-the-middle” (MiM) attack [73] on the topology of a power grid. An MiM attack exploits the lack of authentication in a system, which allows an adversary to impersonate a legitimate participant. In the context of monitoring a transmission grid, sophisticated authentication is typically not implemented because of the need to reduce communication delay and the presence of legacy communication equipment. If an adversary is able to gain access to remote terminal units (RTUs) or local data concentrators, it is possible for the adversary to replace actual data packets with carefully constructed malicious data packets and impersonate a valid data source.

MiM attacks on a power grid may have severe consequences. The adversary can mislead the control center into believing that the grid is operating under a topology different from that in reality. Such an attack, if launched successfully and undetected by the control center, will have serious implications: a grid that is under stress may appear to be normal to the operator, thereby delaying the deployment of necessary measures to ensure stability. Similarly, a grid operating normally may appear to be under stress to the operator, potentially causing load shedding and other costly remedial actions by the operator.

Launching a topology attack, fortunately, is not easy; a modern energy management system is equipped with relatively sophisticated bad data and topology error detectors, which alert the operator that either the data in use are suspicious or there may indeed be changes in the network topology. When there are inconsistencies between the estimated network topology (estimated mostly using switch and breaker states) and the meter data (e.g., there is a significant amount of power flow on a line disconnected in the estimated topology), the operator takes actions to validate the data in use. Only if the data and the estimated topology pass the bad data test will the topology change be accepted and updates be made for subsequent actions.

The attacks that are perhaps the most dangerous are those that pass the bad data detection so that the control center accepts the change (or the lack of change) of network topology. To launch such attacks, the adversary needs to modify simultaneously the meter data and the network data (switch and breaker states) in such a way that the estimated topology is consistent with the data. Such attacks are referred to as undetectable attacks; they are the main focus of this study.

3.1.1 Summary of Results and Organization

We  aim  to  achieve  two  objectives.  First,  we  characterize  conditions  under  which 
undetectable  attacks  are  possible,  given  a  set  of  vulnerable  meters  that  may  be 
controlled  by  an  adversary.  To  this  end,  we  consider  two  attack  regimes  based  on 
the  information  set  available  to  the  attacker.  The  more  information  the  attacker 
has,  the  stronger  its  ability  to  launch  a  sophisticated  attack  that  is  hard  to  detect. 

The global information regime is where the attacker can observe all meter and network data before altering the adversary-controlled part of them. Although it is unlikely in practice that an adversary is able to operate in such a regime, in analyzing the impact of attacks, it is typical to consider the worst case by granting the adversary additional power. In Section 3.3, we present a necessary and sufficient algebraic condition under which, given a set of adversary-controlled meters, there exists an undetectable attack that misleads the control center with an incorrect “target” topology. This algebraic condition provides not only numerical ways to check if the grid is vulnerable to undetectable attacks but also insights into which meters to protect to defend against topology attacks. We also provide specific constructions of attacks and show certain optimality of the proposed attacks.

A more practically significant situation is the local information regime where the attacker has only local information from the meters it has gained control of. Under certain conditions, undetectable attacks exist and can be implemented easily based on simple heuristics. We present in Section 3.4 the intuition behind such simple attacks and implementation details.

The second objective is to provide conditions under which topology attacks cannot be made undetectable. Such a condition, even if it may not be the tightest, provides insights into defense mechanisms against topology attacks. In Section 3.5, we show that if a set of meters satisfying a certain branch covering property are protected, then topology attacks can always be detected. In practice, protecting a meter may be carried out at multiple levels, from physical protection measures to software protection schemes using more sophisticated authentication protocols.

The rest of the chapter is organized as follows. Section 3.2 presents mathematical models of state estimation, bad data test, and topology attacks. In Section 3.3, we study topology attacks in the global information regime. The algebraic condition for an undetectable attack is presented, and construction of a cost-effective undetectable attack is provided. Section 3.4 presents a heuristic attack for the attacker with local information. Based on the algebraic condition presented in Section 3.3, Section 3.5 provides a graph theoretical strategy to add protection to a subset of meters to prevent undetectable attacks. Section 3.6 presents simulation results to demonstrate practical uses of our analysis and feasibility of the proposed attacks.


3.2  Preliminaries 

In this section, we present models for the power network, measurements, and adversary attacks. We also summarize essential operations such as state estimation and bad data detection that are targets of data attacks.

3.2.1 Network and Measurement Models

The control center receives two types of data from meters and sensors deployed throughout the grid. One is the digital network data $s \in \{0, 1\}^d$, which can be represented as a string of binary bits indicating the on and off states of various switches and line breakers. The second type is the analog meter data $z$, which is a vector of bus injection and line flow measurements.

Without an attack or a sensing error, $s$ gives the true breaker states. Each $s \in \{0, 1\}^d$ corresponds to a system topology, which is represented by a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of buses and $\mathcal{E}$ is the set of connected transmission lines. For each physical transmission line between two buses (e.g., $i$ and $j$), we assign an arbitrary direction for the line (e.g., $(i, j)$), and $(i, j)$ is in $\mathcal{E}$ if and only if the line is connected. In addition, $\mathcal{E}_0$ denotes the set of all lines (with the assigned directions), both connected and disconnected. The arbitrary directions carry no physical meaning; they are assigned only for ease of presentation.

The state of a power system is defined as the vector $x$ of voltage phasors on all buses. In the absence of attacks and measurement noise, the meter data $z$ collected by the SCADA system are related to the system state $x$ and the system topology $\mathcal{G}$ via the AC power flow model [51]:

$$z = h(x, \mathcal{G}) + e \tag{3.1}$$

where $z$ typically includes real and reactive parts of bus injection and line flow measurements, $h$ is the nonlinear measurement function of $x$ and $\mathcal{G}$, and $e$ is the additive noise.

A simplified model, one that is often used in real-time operations such as the computation of real-time LMP, is the so-called DC model [51] where the nonlinear function $h$ is linearized near the operating point. In particular, the DC model is given by

$$z = Hx + e \tag{3.2}$$

where $z \in \mathbb{R}^m$ consists of only the real parts of injection and line flow measurements, $H \in \mathbb{R}^{m \times n}$ is the measurement matrix, $x \in \mathbb{R}^n$ is the state vector consisting of voltage phase angles at all buses except the slack bus, and $e \in \mathbb{R}^m$ is the Gaussian measurement noise with a diagonal covariance matrix $\Sigma$.


The fact that the measurement matrix $H$ depends on the network topology $\mathcal{G}$ is important, although we use the notation $H$ without explicit association with its topology $\mathcal{G}$ for notational convenience. For ease of presentation, consider the noiseless measurement $z = Hx$. If an entry $z_k$ of $z$ is the measurement of the line flow from $i$ to $j$ of a connected line in $\mathcal{G}$, $z_k$ is $B_{ij}(x_i - x_j)$ where $B_{ij}$ is the line susceptance and $x_i$ is the voltage phase angle at bus $i$. The corresponding row of $H$ is equal to

$$h_{(i,j)} \triangleq [\, 0 \ \cdots \ 0 \ \underbrace{B_{ij}}_{i\text{th entry}} \ 0 \ \cdots \ 0 \ \underbrace{-B_{ij}}_{j\text{th entry}} \ 0 \ \cdots \ 0 \,]. \tag{3.3}$$

On the other hand, if $z_k$ is the measurement of the line flow through a disconnected line in $\mathcal{G}$, $z_k$ is zero, and the corresponding row of $H$ consists of all zero entries. If $z_k$ is the measurement of bus injection at $i$, it is the sum of all the outgoing line flows from $i$, and the corresponding row of $H$ is the sum of the row vectors corresponding to all the outgoing line flows.
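
The row rules above can be turned into a small routine that assembles $H$ for a given topology. The sketch below is illustrative only: the function names, the 0-based bus indexing, the meter encoding, and the three-bus example with its susceptance values are all assumptions made for this example, not taken from the dissertation.

```python
def flow_row(i, j, n_bus, lines, connected):
    """Row for the flow from bus i to bus j: B_ij at entry i, -B_ij at entry j.

    lines     : dict mapping each directed line (u, v) in E_0 to its susceptance
    connected : set of directed lines currently in E
    Returns an all-zero row if the line is disconnected.
    """
    row = [0.0] * n_bus
    key = (i, j) if (i, j) in lines else (j, i)
    if key in connected:
        row[i] += lines[key]
        row[j] -= lines[key]
    return row


def dc_measurement_matrix(n_bus, lines, connected, meters, slack=0):
    """Build H for meters given as ("flow", (i, j)) or ("inj", i).

    The slack-bus column is dropped, since the state excludes the slack angle.
    """
    H = []
    for kind, where in meters:
        if kind == "flow":
            row = flow_row(where[0], where[1], n_bus, lines, connected)
        else:
            # Injection at a bus: sum of the rows of all outgoing line flows.
            row = [0.0] * n_bus
            for (u, v) in lines:
                if where in (u, v):
                    other = v if where == u else u
                    r = flow_row(where, other, n_bus, lines, connected)
                    row = [a + b for a, b in zip(row, r)]
        H.append([val for bus, val in enumerate(row) if bus != slack])
    return H


# Hypothetical three-bus system with made-up susceptances.
lines = {(0, 1): 10.0, (1, 2): 5.0, (0, 2): 4.0}
meters = [("flow", (0, 1)), ("flow", (1, 2)), ("inj", 1), ("inj", 2)]
H_all = dc_measurement_matrix(3, lines, set(lines), meters)
H_cut = dc_measurement_matrix(3, lines, {(0, 1), (0, 2)}, meters)  # line (1, 2) open
```

Comparing `H_all` and `H_cut` shows how opening a single line zeroes its flow row and changes the injection rows of both end buses, which is why a topology attack must alter several meters consistently.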


We consider both AC and DC power flow models. The DC model allows us to obtain a succinct characterization of undetectable attacks as described in Section 3.3. However, these results hold only locally around the operating point, because the results are obtained from the linearized model. General results for the more realistic (nonlinear) AC model are difficult to obtain. We present in Section 3.4 a heuristic attack that is undetectable for both AC and DC models.

It  was  shown  in  [39]  that  using  the  DC  model  and  linear  state  estimator  in 
numerical  analysis  of  an  attack  tends  to  exaggerate  the  impact  of  the  attack. 
Hence,  for  accurate  analysis,  we  use  the  AC  model  and  nonlinear  state  estimator 
in  the  numerical  simulations  presented  in  Section  3.6. 

3.2.2  Adversary  Model 

The adversary aims at modifying the topology estimate from $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ to a different “target” topology $\bar{\mathcal{G}} = (\mathcal{V}, \bar{\mathcal{E}})$. Note that $\mathcal{G}$ and $\bar{\mathcal{G}}$ have the same set of vertices. In other words, we only consider the attacks aimed at perturbing transmission line connectivities$^1$. In addition, we assume that the power system is observable regardless of whether an attack is present or not: i.e., the measurement matrix in the DC model always has full rank. This means that the adversary avoids misleading the control center with drastic system changes (e.g., division into two disconnected parts) that may draw too much attention from the control center$^2$. We call the lines not common to both $\mathcal{E}$ and $\bar{\mathcal{E}}$ (i.e., lines in $\mathcal{E} \triangle \bar{\mathcal{E}} \triangleq (\mathcal{E} \setminus \bar{\mathcal{E}}) \cup (\bar{\mathcal{E}} \setminus \mathcal{E})$) target lines and the buses at the ends of the target lines target buses.

To alter the network topology, the adversary launches a man-in-the-middle attack as described in Fig. 3.1: it intercepts $(s, z)$ from RTUs, modifies part of them, and forwards the modified version $(\bar{s}, \bar{z})$ to the control center.

$^1$The attacks aiming to split or combine buses are out of the scope of this chapter. Such attacks require modifying the measurements of breaker states inside substations. If the control center employs generalized state estimation [74], such modification invokes substation-level state estimation which leads to a robust bad data test. Hence, it is harder for such attacks to avoid detection.

$^2$In fact, the results to be presented in this chapter also hold for the general case where the target topology can be anything (e.g., the system may be divided into several disconnected parts), if the control center employs the same bad data test even when the network is unobservable.

Figure 3.1: Attack Model with Generalized State Estimation


Throughout this chapter, except in Section 3.4, we assume that the adversary has global information, i.e., it knows network parameters and observes all entries of $(s, z)$ before launching the attack, although it may modify only the entries it gained control of. Such unlimited access to network parameters and data is a huge advantage to the attacker. In Section 3.5, countermeasures are designed under this assumption so that they can be robust to such worst-case attacks.


The mathematical model of an attack to modify $\mathcal{G}$ to $\bar{\mathcal{G}}$ is as follows (a bar on a variable denotes the value modified by the adversary):

$$\begin{aligned} \bar{s} &= s + b \pmod 2, \\ \bar{z} &= z + a(z), \quad a(z) \in \mathcal{A}, \end{aligned} \tag{3.4}$$

where $\bar{s}$ is the modified network data corresponding to $\bar{\mathcal{G}}$, $b \in \{0, 1\}^d$ represents the modifications on the network data $s$, $a(z) \in \mathbb{R}^m$ denotes the attack vector added to the meter data $z$, and $\mathcal{A} \subset \mathbb{R}^m$ denotes the subspace of feasible attack vectors.


We assume that the adversary can modify the network data accordingly for any target topology that is deemed valid by the control center. This is the opposite of the assumption employed by most existing studies on state attacks where network data that specify the topology are not under attack.

For the attack on analog meter data, we use the notation $a(z)$ to emphasize that the adversary can design the attack vector based on the whole meter data $z$. This assumption will be relaxed in Section 3.4 to study an attack with local information. In addition, $\mathcal{A}$ has the form $\{c \in \mathbb{R}^m : c_i = 0, \ i \in \mathcal{I}_S\}$ where $\mathcal{I}_S$ is the set of indices of secure meter data entries that the adversary cannot alter and $\{1, \ldots, m\} \setminus \mathcal{I}_S$ represents the adversary-controlled entries. Note that $\mathcal{A}$ fully characterizes the power of the adversary, and the mapping $a : \mathbb{R}^m \to \mathcal{A}$ fully defines the attack strategy.
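
As a rough illustration of the attack model (3.4) and the feasibility constraint on the attack vector, the following sketch applies a breaker-bit modification and a meter-data modification while rejecting any change to a secure entry. The function name `apply_attack` and the data layout (plain lists, 0-based indices) are assumptions made for this example.

```python
def apply_attack(s, z, b, a, secure):
    """Return (s_bar, z_bar) for network data s, meter data z, bit flips b,
    and attack vector a. `secure` is the set of meter indices the adversary
    cannot alter, so a feasible attack vector must be zero on those entries.
    """
    if any(a[i] != 0 for i in secure):
        raise ValueError("attack vector is outside the feasible subspace")
    s_bar = [(si + bi) % 2 for si, bi in zip(s, b)]  # flip the selected breaker bits
    z_bar = [zi + ai for zi, ai in zip(z, a)]        # add the attack vector
    return s_bar, z_bar


s_bar, z_bar = apply_attack(
    s=[1, 0, 1], z=[1.0, 2.0, 3.0],
    b=[0, 0, 1], a=[0.0, 0.5, 0.0],
    secure={0, 2},
)
```

In this toy case only meter 1 is adversary-controlled, so the attack may change the second meter reading and any breaker bits, but an attack vector with a nonzero first or third entry is rejected.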

3.2.3  State  Estimation,  Bad  Data  Test,  and  Undetectable 
Attacks 

As illustrated in Fig. 3.1, the control center executes generalized state estimation (GSE) [74] with network and meter data as inputs; the inputs are $(s, z)$ in the absence of an attack and $(\bar{s}, \bar{z})$ if there is an attack. GSE regards both network and meter data as possibly erroneous. Once the bad data test detects inconsistency among data and estimates, GSE filters out the outliers from the data and searches for a new pair of topology and state estimates that fit the data best. Our focus is on the attacks that can pass the bad data test such that no alarm is raised by GSE.

Under the general AC model (3.1), if $(\tilde{s}, \tilde{z})$ is the input to GSE, and $\tilde{\mathcal{G}}$ is the topology corresponding to $\tilde{s}$, the control center obtains the weighted least squares (WLS) estimate of the state $x$:

$$\hat{x} = \arg\min_{y} \ (\tilde{z} - h(y, \tilde{\mathcal{G}}))^T \Sigma^{-1} (\tilde{z} - h(y, \tilde{\mathcal{G}})).$$

Note that $\tilde{\mathcal{G}} = \mathcal{G}$ in the absence of an attack while $\tilde{\mathcal{G}} = \bar{\mathcal{G}}$ in the presence of an attack. In practice, nonlinear WLS estimation is implemented numerically [51].

Under the DC model (3.2), the WLS state estimator is a linear estimator with a closed form expression

$$\hat{x} = \arg\min_{y} \ (\tilde{z} - \tilde{H} y)^T \Sigma^{-1} (\tilde{z} - \tilde{H} y) = (\tilde{H}^T \Sigma^{-1} \tilde{H})^{-1} \tilde{H}^T \Sigma^{-1} \tilde{z},$$

where $\tilde{H}$ is the measurement matrix for $\tilde{\mathcal{G}}$. The linear estimator is sometimes used as part of an iterative procedure to obtain the nonlinear WLS solution.

The residue error is often used at the control center for bad data detection [51].
In the so-called $J(\hat{x})$ test [40], the weighted least squares error

$$J(\hat{x}) = (\tilde{z} - h(\hat{x}, \tilde{\mathcal{G}}))^{t} \Sigma^{-1} (\tilde{z} - h(\hat{x}, \tilde{\mathcal{G}}))$$

is used in a threshold test:

$$\begin{cases} \text{bad data} & \text{if } J(\hat{x}) > \tau, \\ \text{good data} & \text{if } J(\hat{x}) \le \tau, \end{cases} \tag{3.5}$$

where $\tau$ is the detection threshold, chosen to satisfy a given false
alarm constraint $\alpha$.
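As a rough illustration of the DC-model estimator and the threshold test (3.5), the sketch below computes the WLS estimate from the normal equations and then evaluates the residue statistic. It assumes a diagonal noise covariance and a measurement matrix with full column rank; the function names and the threshold `tau` are illustrative choices and do not come from the text.

```python
from fractions import Fraction


def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Assumes A is nonsingular, i.e. H has full column rank.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def wls_estimate(H, z, variances):
    """WLS state estimate for z = H x + e with Sigma = diag(variances)."""
    m, n = len(H), len(H[0])
    w = [1 / Fraction(v) for v in variances]
    # Normal equations: (H^t Sigma^{-1} H) x = H^t Sigma^{-1} z
    gain = [[sum(w[k] * H[k][i] * H[k][j] for k in range(m)) for j in range(n)]
            for i in range(n)]
    rhs = [sum(w[k] * H[k][i] * z[k] for k in range(m)) for i in range(n)]
    return solve(gain, rhs)


def j_statistic(H, z, variances):
    """Weighted squared residue J(x_hat) = r^t Sigma^{-1} r."""
    x_hat = wls_estimate(H, z, variances)
    residual = [zk - sum(h * x for h, x in zip(row, x_hat)) for row, zk in zip(H, z)]
    return sum(r * r / Fraction(v) for r, v in zip(residual, variances))


def is_bad_data(H, z, variances, tau):
    """Threshold test (3.5): declare bad data when J(x_hat) exceeds tau."""
    return j_statistic(H, z, variances) > tau


# Toy example (hypothetical numbers): three meters, two states, unit variances.
H_example = [[1, 0], [0, 1], [1, -1]]
consistent = [2, 1, 1]   # equals H_example applied to x = (2, 1)
corrupted = [2, 1, 4]    # third meter altered
```

With these inputs, `j_statistic` is zero for `consistent` and positive for `corrupted`, so whether an alarm is raised depends only on where `tau` is set.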


We call an attack undetectable if its detection probability is no greater than
the false alarm rate of the detector. We assume that the $J(\hat{x})$ test is used as the
bad data detector.

Definition 3.2.1 An attack $a$ to modify $\mathcal{G}$ to $\bar{\mathcal{G}}$ is said to be undetectable if, for
any true state $x$, the $J(\hat{x})$-test with any false alarm constraint detects the attack
with a detection probability no greater than its false alarm rate.

In the absence of noise, the only source of bad data is, presumably, an attack.
In this case, the probabilistic statement of undetectability becomes a deterministic
one. A data attack $(z + a(z), \bar{s})$ that modifies the topology from $\mathcal{G}$ to $\bar{\mathcal{G}}$ is
undetectable if, for every noiseless measurement $z$, there exists a state vector $\bar{x}$ such
that $z + a(z) = h(\bar{x}, \bar{\mathcal{G}})$. Unfortunately, such a nonlinear condition is difficult to
check.

Under the DC model, however, the undetectability condition has a simple
algebraic form. Let $(\tilde{s}, \tilde{z})$ be the input to GSE and $\tilde{H}$ the measurement matrix
for the topology corresponding to $\tilde{s}$. In the presence of an attack, GSE receives
$(\bar{s}, \bar{z})$ instead of $(s, z)$, and $\bar{H}$, the measurement matrix for the target topology
$\bar{\mathcal{G}}$, replaces $H$. In the absence of noise, the $J(\hat{x})$-detector is equivalent to checking
whether the received meter data is in the column space of the valid measurement
matrix. Thus, the equivalent undetectable topology attack can be defined by the
following easily checkable form:

Definition 3.2.2 An attack to modify $\mathcal{G}$ to $\bar{\mathcal{G}}$ with the attack vector $a$ is said to
be undetectable if

$$z + a(z) \in \mathrm{Col}(\bar{H}), \quad \forall z \in \mathrm{Col}(H), \tag{3.6}$$

where $H$ and $\bar{H}$ are the measurement matrices for $\mathcal{G}$ and $\bar{\mathcal{G}}$ respectively, and $\mathrm{Col}(H)$
and $\mathrm{Col}(\bar{H})$ are the column spaces of $H$ and $\bar{H}$.

3.3 Topology Attack with Global Information

We assume the DC model (3.2) and present the result for the existence of
undetectable topology attacks.

3.3.1  Condition  for  an  Undetectable  Attack 

We first derive a necessary and sufficient algebraic condition for the existence of an
undetectable attack that modifies $\mathcal{G}$ to $\bar{\mathcal{G}}$ with the subspace $\mathcal{A}$ of feasible attack
vectors. To motivate the general result, consider first the noiseless case.

Noiseless  Measurement  Case 

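Suppose there is an undetectable attack $a$ with $a(z) \in \mathcal{A}$, $\forall z \in \mathrm{Col}(H)$. Then,
undetectability implies that $z + a(z) \in \mathrm{Col}(\bar{H})$, $\forall z \in \mathrm{Col}(H)$, and thus
$\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, \mathcal{A})$.$^3$

Now suppose $\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, \mathcal{A})$. There exists a basis $\{c_1, \ldots, c_p, d_1, \ldots, d_q\}$
of $\mathrm{Col}(\bar{H}, \mathcal{A})$ such that $\{c_1, \ldots, c_p\}$ is a subset of columns of $\bar{H}$ and $\{d_1, \ldots, d_q\}$
is a set of linearly independent vectors in $\mathcal{A}$. For any $z \in \mathrm{Col}(H)$, since
$\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, \mathcal{A})$, there exist unique $(\alpha_i)_{i=1}^{p} \in \mathbb{R}^p$ and $(\beta_j)_{j=1}^{q} \in \mathbb{R}^q$ such that
$z = \sum_{i=1}^{p} \alpha_i c_i + \sum_{j=1}^{q} \beta_j d_j$. If we set $a(z) = -\sum_{j=1}^{q} \beta_j d_j$, then
$z + a(z) = \sum_{i=1}^{p} \alpha_i c_i \in \mathrm{Col}(\bar{H})$. In
addition, $a(z) \in \mathcal{A}$ for all $z$. Hence, there exists an undetectable attack with the
subspace $\mathcal{A}$ of feasible attack vectors.

The above arguments lead to the following theorem.

Theorem 3.3.1 There exists an undetectable attack to modify $\mathcal{G}$ to $\bar{\mathcal{G}}$ with the
subspace $\mathcal{A}$ of feasible attack vectors if and only if $\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, \mathcal{A})$.

$^3$ $\mathrm{Col}(\bar{H}, \mathcal{A})$ denotes the space spanned by the columns of $\bar{H}$ and a basis of $\mathcal{A}$.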
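The inclusion in Theorem 3.3.1 can be checked numerically by comparing ranks: the columns of the actual measurement matrix lie in the span of the target matrix and the attack subspace exactly when appending them does not raise the rank. The sketch below is one way to do this under the assumption that the attack subspace is spanned by the unit vectors of the adversary-controlled meters; the helper names and the 3-bus matrices are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    mat = [[Fraction(v) for v in row] for row in rows]
    ncols = len(mat[0]) if mat else 0
    rk = 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(mat)) if mat[r][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for r in range(rk + 1, len(mat)):
            if mat[r][col] != 0:
                f = mat[r][col] / mat[rk][col]
                mat[r] = [a - f * b for a, b in zip(mat[r], mat[rk])]
        rk += 1
    return rk


def undetectable_attack_exists(H, H_bar, controlled):
    """Check Col(H) subset of Col(H_bar, A).

    Assumption: A is spanned by the unit vectors e_k, k in `controlled`
    (row indices of the meters the adversary can alter).
    """
    m = len(H)
    cols = sorted(controlled)
    base = [list(H_bar[i]) + [1 if i == k else 0 for k in cols] for i in range(m)]
    extended = [base[i] + list(H[i]) for i in range(m)]
    return rank(base) == rank(extended)


# Hypothetical 3-bus DC example, state = phase angles at buses 2 and 3 (bus 1 is slack).
# Meter rows: flow 1->2, flow 2->3, flow 1->3, injection at 1, injection at 3.
H = [[-1, 0], [1, -1], [0, -1], [-1, -1], [-1, 2]]      # all three lines connected
H_bar = [[-1, 0], [1, -1], [0, 0], [-1, 0], [-1, 1]]    # line (1,3) removed
```

In this toy network, controlling the flow meter on line (1,3) and both injection meters (rows 2, 3, 4) satisfies the condition, while dropping the injection meter at bus 3 breaks it.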


Noisy  Measurement  Case 

The  following  theorem  states  that  the  algebraic  condition  in  Theorem  3.3.1  can 
also  be  used  in  the  noisy  measurement  case. 

Theorem 3.3.2 There exists an undetectable attack to modify $\mathcal{G}$ to $\bar{\mathcal{G}}$ with the
subspace $\mathcal{A}$ of feasible attack vectors if and only if $\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, \mathcal{A})$.

In addition, if an attack $a$ is such that $\mathrm{Col}(H) \not\subset \mathrm{Col}(\bar{H}, \mathcal{A})$, then for almost
every$^4$ $x \in \mathbb{R}^n$, when $x$ is the true state, the detection probability for the attack
approaches 1 as the noise variances uniformly decrease to 0 (i.e., $\max_i \Sigma_{ii}$, where
$\Sigma_{ii}$ is the $(i, i)$ entry of $\Sigma$, decays to 0).

Proof:  See  Section  3.7.  ■ 

Note that when the algebraic condition is not met, the attack can be detected
with high probability if the noise variances are sufficiently small. With this
algebraic condition, we can check whether the adversary can launch an undetectable
attack with $\mathcal{A}$ for the target $\bar{\mathcal{G}}$. The condition will be used in Section 3.5 to
construct a meter protection strategy to disable undetectable attacks for any target
topology.

By finding the smallest dimension of $\mathcal{A}$ satisfying the condition, we can also
characterize the minimum cost of undetectable attacks for $\bar{\mathcal{G}}$; from the adversary's
point of view, a smaller dimension of $\mathcal{A}$ is preferred, because increasing the
dimension of $\mathcal{A}$ necessitates compromising more RTUs or communication devices. In the
following section, we present an undetectable attack requiring a small number of
data modifications and prove its optimality for a class of targets by utilizing the
algebraic condition.

$^4$ This means "for all $x \in \mathbb{R}^n \setminus S$, for some $S \subset \mathbb{R}^n$ with zero Lebesgue measure".

3.3.2  State-preserving  Attack 

This  section  presents  a  simple  undetectable  attack,  referred  to  as  state-preserving 
attack.  As  the  name  suggests,  the  attack  intentionally  preserves  the  state  in  order 
to  have  a  sparse  attack  vector.  We  again  motivate  our  result  by  considering  first 
the  noiseless  case. 

Noiseless  Measurement  Case 

Given $z = Hx \in \mathrm{Col}(H)$, the state-preserving attack sets $a(z)$ equal to $(\bar{H} - H)x$.
Then, $z + a(z) = \bar{H}x \in \mathrm{Col}(\bar{H})$; the attack is undetectable. Note that the state
$x$ remains the same after the attack. Since $H$ has full column rank, $a(z)$ can be
simply calculated as

$$a(z) = (\bar{H} - H)x = (\bar{H} - H)(H^{t}H)^{-1}H^{t}z. \tag{3.7}$$

For $a(z)$ above to be a valid attack vector, it must be sparse: its nonzero entries
must be confined to the meters whose data the adversary can alter.

To see intuitively why $\bar{H}x - Hx$ is sparse, consider the simple case in which
a line is removed from the topology while the state is preserved. In this case, the
flows through all the lines except the removed line stay the same, because
the line flow from $i$ to $j$ is determined by (i) $(x_i, x_j)$ and (ii) whether $i$ and $j$ are
connected, and for most lines these two factors remain the same. Hence, only a few
entries differ between $\bar{H}x$ and $Hx$. Below, we will show that, for every state
$x \in \mathbb{R}^n$, all entries of $(\bar{H} - H)x$ are zero except those associated with the target
lines.

As noted in [26], $H$ can be decomposed as $H = MBA^{t}$, where $M \in \mathbb{R}^{m \times l}$ is the
measurement-to-line incidence matrix with $l = |\mathcal{E}_0|$, $B \in \mathbb{R}^{l \times l}$ is a diagonal matrix
with the line susceptances in the diagonal entries, and $A^{t} \in \mathbb{R}^{l \times n}$ is the line-to-bus
incidence matrix. Each column of $M$ (each row of $A^{t}$) corresponds to a distinct line
in $\mathcal{E}_0$. For $1 \le j \le l$, if the $j$th column of $M$ corresponds to $(a, b) \in \mathcal{E}_0$, let $v_j^{+} = a$
and $v_j^{-} = b$. Then, $M$ is defined such that $M_{ij} = \pm 1$ if the $i$th meter (the meter
corresponding to the $i$th row of $M$) measures (i) the line flow from $v_j^{\pm}$ to $v_j^{\mp}$ or (ii)
the injection at bus $v_j^{\pm}$; otherwise, $M_{ij} = 0$. For $A^{t}$, $(A^{t})_{ji} = \pm 1$ if $v_j^{\pm} = i$ and the
line corresponding to the $j$th row of $A^{t}$ (or equivalently the $j$th column of $M$) is
connected in $\mathcal{G}$; otherwise, $(A^{t})_{ji} = 0$. Note that $M$ and $B$ are independent of the
topology, but $A^{t}$ does depend on $\mathcal{G}$. Fig. 3.2 provides an example to illustrate the
structures of $M$, $B$, and $A^{t}$. Similarly, $\bar{H}$ is decomposed as $\bar{H} = MB\bar{A}^{t}$.

As illustrated in Fig. 3.2, the entries of $BA^{t}x \in \mathbb{R}^{l \times 1}$ correspond to the line
flows of all the lines in $\mathcal{E}_0$ when the state is $x$ and the topology is $\mathcal{G}$. Similarly, $B\bar{A}^{t}x$
is the vector of line flows when the state is $x$ and the topology is $\bar{\mathcal{G}}$. Since the states
are the same, the $k$th entry of $BA^{t}x$ and that of $B\bar{A}^{t}x$ are different only if the
corresponding line is connected in one of $\mathcal{G}$ and $\bar{\mathcal{G}}$ while disconnected in the other.
Therefore, $(B\bar{A}^{t} - BA^{t})x$ has all zero entries except the entries corresponding to
the lines in $\mathcal{E} \triangle \bar{\mathcal{E}}$. Specifically, the entry corresponding to $(i, j) \in \bar{\mathcal{E}} \setminus \mathcal{E}$ assumes
$f_{ij}(x) = B_{ij}(x_i - x_j)$, and the entry corresponding to $(i, j) \in \mathcal{E} \setminus \bar{\mathcal{E}}$ assumes $-f_{ij}(x)$.

Figure 3.2: The measurement, line, or bus corresponding to each row or column is
labeled. Bus 1 is the slack bus. For the rows of $M$, $i$ denotes the injection meter at
bus $i$, and $(i, j)$ the meter for the line flow from $i$ to $j$.

Hence, $(\bar{H} - H)x = M(B\bar{A}^{t} - BA^{t})x$ is equal to

$$\sum_{(i,j) \in \bar{\mathcal{E}} \setminus \mathcal{E}} f_{ij}(x)\, m_{(i,j)} - \sum_{(i,j) \in \mathcal{E} \setminus \bar{\mathcal{E}}} f_{ij}(x)\, m_{(i,j)}, \tag{3.8}$$

where $m_{(i,j)}$ is the column vector of $M$ corresponding to $(i, j)$. Note that $m_{(i,j)}$ is
a sparse vector that has nonzero entries only at the rows corresponding to the line
flow meters on the line $(i, j)$ and the injection meters at $i$ and $j$.

From (3.8), for any state $x \in \mathbb{R}^n$, $(\bar{H} - H)x$ is a linear combination of elements
in $\{m_{(i,j)} : (i, j) \in \mathcal{E} \triangle \bar{\mathcal{E}}\}$. Hence, the state-preserving attack, which sets
$a(z) = (\bar{H} - H)x$, modifies at most the line flow meters on the target lines and the injection
meters at the target buses.
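To make the sparsity argument concrete, the following sketch builds the measurement matrix as the product of a measurement-to-line incidence matrix, a diagonal susceptance matrix, and a line-to-bus incidence matrix for a small example, and then evaluates the state-preserving attack vector for a single line removal. The 3-bus network, unit susceptances, meter placement, and function names are all hypothetical choices made for illustration.

```python
def line_to_bus_incidence(lines, connected, state_buses):
    """Rows of A^t: +1 at the 'from' bus and -1 at the 'to' bus of each connected line.

    Open lines get an all-zero row; the slack bus has no column.
    """
    rows = []
    for line in lines:
        row = [0] * len(state_buses)
        if line in connected:
            a, b = line
            if a in state_buses:
                row[state_buses.index(a)] = 1
            if b in state_buses:
                row[state_buses.index(b)] = -1
        rows.append(row)
    return rows


def measurement_matrix(M, susceptances, At):
    """H = M B A^t with B = diag(susceptances)."""
    BAt = [[s * v for v in row] for s, row in zip(susceptances, At)]
    return [[sum(M[i][k] * BAt[k][j] for k in range(len(BAt)))
             for j in range(len(BAt[0]))] for i in range(len(M))]


def matvec(H, x):
    return [sum(h * v for h, v in zip(row, x)) for row in H]


# Hypothetical 3-bus system: bus 1 is the slack bus, state x = (theta_2, theta_3).
lines = [(1, 2), (2, 3), (1, 3)]      # all candidate lines, unit susceptances
susceptances = [1, 1, 1]
state_buses = [2, 3]
# Meter rows: flow 1->2, flow 2->3, flow 1->3, injection at 1, injection at 3.
M = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1],
     [1, 0, 1],
     [0, -1, -1]]

H = measurement_matrix(
    M, susceptances, line_to_bus_incidence(lines, set(lines), state_buses))
H_bar = measurement_matrix(
    M, susceptances, line_to_bus_incidence(lines, {(1, 2), (2, 3)}, state_buses))

x = [1, 2]
# State-preserving attack vector (H_bar - H) x for removing line (1,3).
attack = [hb - h for hb, h in zip(matvec(H_bar, x), matvec(H, x))]
support = {k for k, v in enumerate(attack) if v != 0}
```

Here `support` contains only the rows of the flow meter on line (1,3) and the injection meters at buses 1 and 3, which is the column of `M` for the removed line scaled by the negated line flow.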


We now show in the next two theorems that, under certain conditions, the
state-preserving attack has the least cost in the sense that it requires the adversary to
modify the smallest number of meter data (i.e., the smallest dimension for $\mathcal{A}$).

Theorem 3.3.3 Assume that (i) the actual and target topologies differ by only
one line, i.e., $|\mathcal{E} \triangle \bar{\mathcal{E}}| = 1$, and (ii) every line in $\bar{\mathcal{E}}$, incident$^5$ from or to any target
bus with an injection meter, has at least one line flow meter on it. Then, among
all undetectable attacks, the state-preserving attack modifies the smallest number
of meters, which is the total number of line flow and injection meters located on
the target line and target buses.

Proof:  See  Section  3.7.  ■ 


Another scenario in which the state-preserving attack has the minimum cost is
when the adversary aims to delete lines from the actual topology.


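Theorem 3.3.4 Let $\mathcal{G}^*$ and $\bar{\mathcal{G}}^*$ denote the undirected versions of $\mathcal{G}$ and $\bar{\mathcal{G}}$
respectively. Suppose that the adversary aims to remove lines from $\mathcal{G}$, i.e., $\bar{\mathcal{E}} \subset \mathcal{E}$, and
the following hold: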

1. Every line in $\bar{\mathcal{E}}$, incident from or to a target bus with an injection meter, has
at least one line flow meter on it.

2. In $\mathcal{G}^*$, target lines do not form a closed path.

3. $\bar{\mathcal{G}}^*$ does not include a tree $\mathcal{T}$ satisfying the following:

1) (number of nodes in $\mathcal{T}$) $\ge 4$, and

2) every node in $\mathcal{T}$ is a target bus with an injection meter.

$^5$ A line $(i, j)$ is said to be incident from $i$ and incident to $j$.

Then, among all undetectable attacks, the state-preserving attack modifies the
smallest number of meters, which is the total number of line flow and injection
meters located on the target lines and target buses.

Proof:  See  Section  3.7.  ■ 

Roughly speaking, the assumptions in Theorem 3.3.4 hold when target lines are
far from each other such that there is no big tree in $\bar{\mathcal{G}}$ consisting solely of target
buses.

The main advantage of the state-preserving attack is that by preserving the
system state during the attack, the attack can be launched by perturbing only local
meters around the target lines; hence, only a few data entries need to be modified.
Theorem 3.3.3 and Theorem 3.3.4 support the claim by stating the optimality of
the state-preserving attack under mild assumptions. The theorems also imply
that the minimum cost of an undetectable attack can be easily characterized if the
target topology satisfies the theorem assumptions.

Noisy  Measurement  Case 

Following the intuition behind the state-preserving attack in the noiseless case, we
will construct its counterpart for the noisy measurement case. Recall the relation
(3.8):

$$(\bar{H} - H)x = \sum_{(i,j) \in \bar{\mathcal{E}} \setminus \mathcal{E}} f_{ij}(x)\, m_{(i,j)} - \sum_{(i,j) \in \mathcal{E} \setminus \bar{\mathcal{E}}} f_{ij}(x)\, m_{(i,j)}.$$

The above implies that

$$(\bar{H} - H)x \in \mathcal{M} \triangleq \mathrm{span}\{m_{(i,j)} : (i, j) \in \mathcal{E} \triangle \bar{\mathcal{E}}\}. \tag{3.9}$$

We set $a(z)$ as a minimizer of the $J(\hat{x})$-test statistic$^6$:

$$a(z) = \arg\min_{d \in \mathcal{M}} \, \|(z + d) - \bar{H}\hat{x}_{\mathrm{WLS}}[z + d]\|^2_{\Sigma^{-1}}, \tag{3.10}$$

where $\hat{x}_{\mathrm{WLS}}[z + d]$ denotes the WLS state estimate when the topology estimate
is $\bar{\mathcal{G}}$ and $z + d$ is observed at the control center. Note that, since $a(z) \in \mathcal{M}$, the
attack with $a$ modifies at most the line flow measurements of the target lines and
the injection measurements of the target buses.

Now, suppose that the adversary modifies breaker state measurements such
that the topology estimate becomes $\bar{\mathcal{G}}$ and simultaneously modifies the meter data
with $a(z)$. Then, the $J(\hat{x})$-test statistic at the control center is upper bounded as

$$\|(z + a(z)) - \bar{H}\hat{x}_{\mathrm{WLS}}[z + a(z)]\|^2_{\Sigma^{-1}} \le \|(\bar{H}x + e) - \bar{H}\hat{x}_{\mathrm{WLS}}[\bar{H}x + e]\|^2_{\Sigma^{-1}},$$

because $(\bar{H} - H)x$ is an element of $\mathcal{M}$. Note that the right hand side is the $J(\hat{x})$-test
statistic when the meter data are consistent with the topology estimate $\bar{\mathcal{G}}$. Hence,
it has a $\chi^2_{m-n}$ distribution, the same as the distribution of the $J(\hat{x})$-test statistic
in the absence of bad data [40]. This argument leads to the following theorem
stating that this attack is undetectable.

Theorem  3.3.5  The  state-preserving  attack  a,  defined  in  (3.10),  is  undetectable. 

Note that $\hat{x}_{\mathrm{WLS}}[z + d]$ in (3.10) is a linear function of $z + d$, so $a(z)$ can be
obtained as a linear weighted least squares solution. Specifically, $a(z)$ has the form
$a(z) = Dz$, where $D \in \mathbb{R}^{m \times m}$ depends on $\mathcal{G}$, $\bar{\mathcal{G}}$, and $\Sigma$, but not on $z$. Hence, $D$
can be obtained off-line before observing $z$.
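One way to see the linear form is the following short derivation. It is a sketch rather than the author's construction: the symbols $P$, $W$, and $U$ are introduced here for illustration, with $U$ assumed to be any matrix whose columns form a basis of the span in (3.9), and the pseudo-inverse is used so that the formula still selects a minimizer when the minimizer is not unique.

```latex
% Assumed notation (not from the original text): P, W, U, and the pseudo-inverse (.)^{+}.
\begin{align*}
\hat{x}_{\mathrm{WLS}}[\bar{z}] &= (\bar{H}^{t}\Sigma^{-1}\bar{H})^{-1}\bar{H}^{t}\Sigma^{-1}\bar{z},
  & P &\triangleq \bar{H}(\bar{H}^{t}\Sigma^{-1}\bar{H})^{-1}\bar{H}^{t}\Sigma^{-1}, \\
\|(z+d) - \bar{H}\hat{x}_{\mathrm{WLS}}[z+d]\|^{2}_{\Sigma^{-1}} &= (z+d)^{t}\,W\,(z+d),
  & W &\triangleq (I-P)^{t}\Sigma^{-1}(I-P) = \Sigma^{-1}(I-P), \\
d = Uc \;\Rightarrow\; U^{t}WU\,c &= -\,U^{t}W z
  & &\text{(normal equations)}, \\
a(z) = Dz, \qquad D &= -\,U\,(U^{t}WU)^{+}\,U^{t}W .
\end{align*}
```

The second identity for $W$ uses $P^{2} = P$ and the symmetry of $\Sigma^{-1}P$. Since $U$, $\bar{H}$, and $\Sigma$ do not involve $z$, the matrix $D$ in this sketch can indeed be computed before any measurement is observed.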

$^6$ We use $\|r\|^2_{\Sigma^{-1}}$ to denote the quadratic form $r^{t}\Sigma^{-1}r$.

Note also that the state-preserving attacks in the noiseless and noisy cases
modify the same set of meters. In addition, recall that the condition for existence
of an undetectable attack is the same for both noiseless and noisy cases. The
optimality statements for the state-preserving attack in Theorem 3.3.3 and
Theorem 3.3.4 were derived purely based on the condition for undetectability. Hence,
the same optimality statements hold for the noisy measurement case, as stated in
the following corollary, and the same interpretation can be made.


Corollary 3.3.5.1 For the noisy measurement DC model, suppose that the
condition in Theorem 3.3.3 or the condition in Theorem 3.3.4 holds. Then, among all
undetectable attacks, the state-preserving attack modifies the smallest number of
meters, which is the total number of line flow and injection meters located on the
target lines and target buses.


3.4  Topology  Attack  with  Local  Information 

In  this  section,  we  consider  the  more  realistic  scenario  of  a  weak  attacker  who 
does  not  have  the  measurement  data  of  the  entire  network;  it  only  has  access 
to  a  few  meters.  The  information  available  to  the  adversary  is  local.  We  also 
generalize  the  linear  (DC)  measurement  model  to  the  nonlinear  (AC)  model.  The 
resulting  undetectable  attacks,  however,  are  limited  to  line  removal  attacks,  i.e., 
the  adversary  only  tries  to  remove  lines  from  the  actual  network  topology. 

We first consider the noiseless measurement case under the DC model. Since we
are restricted to line-removal attacks, $\bar{\mathcal{E}}$ is a strict subset of $\mathcal{E}$. Therefore, recalling
(3.8), we have

$$(\bar{H} - H)x = -\sum_{(i,j) \in \mathcal{E} \setminus \bar{\mathcal{E}}} f_{ij}(x)\, m_{(i,j)}, \tag{3.11}$$

where $f_{ij}(x)$, as defined in Section 3.3, denotes the line flow from $i$ to $j$ when the
line is connected and the state is $x$.

Let $z_{ij}$ denote the measurement of the line flow from $i$ to $j$. Due to the absence
of noise, $z_{ij} = f_{ij}(x) = -f_{ji}(x) = -z_{ji}$. With this observation and (3.11), we have

$$(\bar{H} - H)x = -\sum_{(i,j) \in \mathcal{E} \setminus \bar{\mathcal{E}}} z_{ij}\, m_{(i,j)}. \tag{3.12}$$

Therefore, setting $a(z) = (\bar{H} - H)x$, which is the state-preserving attack, is
equivalent to setting

$$a(z) = -\sum_{(i,j) \in \mathcal{E} \setminus \bar{\mathcal{E}}} z_{ij}\, m_{(i,j)}. \tag{3.13}$$

From (3.13), one can see that adding the above $a(z)$ to $z$ is equivalent to the
following heuristic described in Fig. 3.3:

1. For every target line $(i, j)$, subtract $z_{ij}$ and $z_{ji}$ from the injection
measurements at $i$ and $j$ respectively.

2. For every target line $(i, j)$, modify $z_{ij}$ and $z_{ji}$ to 0.

This heuristic simply forces the line flows through the target lines, which are
disconnected in $\bar{\mathcal{G}}$, to be zeros, while adjusting the injections at the target buses
to satisfy the power balance equations [51]. If a target line $(i, j)$ has only one line
flow meter (e.g., $z_{ji}$), we can use $-z_{ji}$ in the place of $z_{ij}$. But, if some target line
has no line flow meter, this heuristic is not applicable. Note that the heuristic
only requires the ability to observe and modify the line flow measurements of the
target lines and the injection measurements at the target buses. The adversary
can launch it without knowing the topology or network parameters (i.e., $H$ and $\bar{H}$
are not necessary). Since the heuristic is equivalent to the state-preserving attack,
it is undetectable.

Figure 3.3: Heuristic Operations Around the Target Line $(i, j)$
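The two heuristic steps can be written as a few lines of code operating only on the local measurements. The sketch below is illustrative: the dictionary-based data layout and the function name are assumptions, and the numerical example is a hypothetical noiseless 3-bus case.

```python
def heuristic_line_removal(injections, flows, target_lines):
    """Apply the local line-removal heuristic and return modified copies.

    injections: dict bus -> injection measurement (metered buses only)
    flows: dict (i, j) -> measured line flow from i to j (metered directions only)
    target_lines: lines (i, j) that should appear disconnected
    """
    new_inj, new_flows = dict(injections), dict(flows)
    for (i, j) in target_lines:
        z_ij = flows.get((i, j))
        z_ji = flows.get((j, i))
        if z_ij is None and z_ji is None:
            raise ValueError(f"no line flow meter on target line {(i, j)}")
        # With a single flow meter, use the negated reading of the other direction.
        if z_ij is None:
            z_ij = -z_ji
        if z_ji is None:
            z_ji = -z_ij
        # Step 1: subtract the line flows from the injections at both end buses.
        if i in new_inj:
            new_inj[i] -= z_ij
        if j in new_inj:
            new_inj[j] -= z_ji
        # Step 2: force the measured flows on the target line to zero.
        for key in ((i, j), (j, i)):
            if key in new_flows:
                new_flows[key] = 0
    return new_inj, new_flows


# Hypothetical noiseless 3-bus example with angles (0, 1, 2) and unit susceptances.
injections = {1: -3, 3: 3}
flows = {(1, 2): -1, (2, 3): -1, (1, 3): -2}
attacked_inj, attacked_flows = heuristic_line_removal(injections, flows, [(1, 3)])
```

In this example the modified injections equal the sums of the remaining line flows at buses 1 and 3, so the data are consistent with the topology in which line (1,3) is open.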

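The same heuristic is applicable to the noisy measurements $z = Hx + e$.
To avoid detection, the adversary can make $a(z)$ approximate $\bar{H}x - Hx$ such
that $z + a(z)$ is close to $\bar{H}x + e$. Because $z_{ij} = f_{ij}(x) + e_{ij}$, $z_{ij}$ is an
unbiased estimate of $f_{ij}(x)$. Similarly, $-\sum_{(i,j) \in \mathcal{E} \setminus \bar{\mathcal{E}}} z_{ij}\, m_{(i,j)}$ is an unbiased estimate of
$-\sum_{(i,j) \in \mathcal{E} \setminus \bar{\mathcal{E}}} f_{ij}(x)\, m_{(i,j)}$, which is equal to $\bar{H}x - Hx$. Hence, it is reasonable to set
$a(z) = -\sum_{(i,j) \in \mathcal{E} \setminus \bar{\mathcal{E}}} z_{ij}\, m_{(i,j)}$ even in the noisy measurement case.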

The same idea is applicable to the AC power flow model with the nonlinear
state estimator. Suppose that $z$ is the real power measurement from the AC power
flow model: $z = h(x) + e$, where $x$ is the vector of the voltage phasors at all buses,
and $h$ is the nonlinear measurement function for $\mathcal{G}$. Let $\bar{h}$ denote the measurement
function for $\bar{\mathcal{G}}$. If $a(z)$ is equal to $\bar{h}(x) - h(x)$,

$$\bar{z} = (h(x) + e) + a(z) = \bar{h}(x) + e, \tag{3.14}$$

which is consistent with $\bar{\mathcal{G}}$, so the attack cannot be detected. We will show that
the attack vector of the heuristic approximates $\bar{h}(x) - h(x)$.

For simplicity, assume that the attacker aims at removing a single line $(i, j)$
from $\mathcal{G}$. Then, $h(x)$ and $\bar{h}(x)$ are different only in the entries corresponding to the
injections at $i$ and $j$ and the line flows through $(i, j)$. Specifically, $\bar{h}(x) - h(x)$ has
all zero entries except $-h_{ij}(x)$ at the rows corresponding to the injection at $i$ and
the line flow from $i$ to $j$, and $-h_{ji}(x)$ at the rows corresponding to the injection at $j$
and the line flow from $j$ to $i$, where $h_{ij}(x)$ denotes the entry of $h(x)$ corresponding
to the line flow from $i$ to $j$. Since $z_{ij} = h_{ij}(x) + e_{ij}$ and $z_{ji} = h_{ji}(x) + e_{ji}$, $z_{ij}$ and $z_{ji}$
can be considered as unbiased estimates of $h_{ij}(x)$ and $h_{ji}(x)$ respectively. Hence,
the attacker can use $z_{ij}$ and $z_{ji}$ to construct an unbiased estimate of $\bar{h}(x) - h(x)$.
Adding this estimate to $z$ is equivalent to the heuristic operation of Fig. 3.3, which
subtracts $z_{ij}$ and $z_{ji}$ from $z_i$ and $z_j$ respectively, and sets $z_{ij}$ and $z_{ji}$ to zeros. The
same argument holds for the reactive measurement part and multiple-line removal
attacks. In practice, the heuristic attack should be executed twice separately,
once for the real power measurements and once for the reactive power measurements. In Section 3.6,
numerical simulations demonstrate that the heuristic attack on the AC power flow
model with the nonlinear state estimation has a very low detection probability.


3.5  Countermeasure  for  Topology  Attacks 

In this section, we consider countermeasures that prevent attacks by a strong
adversary with global information. In particular, we assume that a subset of meters
can be secured so that the adversary cannot modify data from these meters. In
practice, this can be accomplished by implementing more sophisticated
authentication protocols. We present a so-called cover-up protection that identifies the set
of meters that need to be secured.

The algebraic condition in Theorems 3.3.1-3.3.2 provides a way to check
whether a set of adversary-controlled meters is enough to launch an undetectable
attack. Restating the algebraic condition, there exists an undetectable attack with
the subspace $\mathcal{A}$ of feasible attack vectors if and only if $\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, \mathcal{A})$ for
some $\bar{\mathcal{G}}$ (different from $\mathcal{G}$).

Let $\mathcal{I}_S$ denote the set of indices for the entries of $z$ corresponding to the
protected meters. Then, $\mathcal{A}$ is $\{c \in \mathbb{R}^m : c_i = 0, \ i \in \mathcal{I}_S\}$. The objective of the control
center is to make any undetectable attack infeasible while minimizing the cost of
protection (i.e., minimizing $|\mathcal{I}_S|$ or equivalently, maximizing the dimension of $\mathcal{A}$).

To achieve the protection goal, $\mathcal{A}$ should satisfy that for any target topology
$\bar{\mathcal{G}}$, $\mathrm{Col}(H) \not\subset \mathrm{Col}(\bar{H}, \mathcal{A})$. However, finding such $\mathcal{A}$ by checking the conditions for
all possible targets is computationally infeasible. To avoid this computational burden,
the following theorem gives a simple graph-theoretical strategy.

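Theorem 3.5.1 (Cover-up strategy) Let $\tilde{\mathcal{E}}$ and $\tilde{\mathcal{E}}_0$ denote the undirected
counterparts of $\mathcal{E}$ and $\mathcal{E}_0$ respectively. For $i \in \mathcal{V}$, let $\mathcal{L}_i$ denote the set of edges in $(\mathcal{V}, \tilde{\mathcal{E}}_0)$
that are incident to $i$.

Suppose there is a spanning tree $\mathcal{T} = (\mathcal{V}, \mathcal{E}_{\mathcal{T}})$ of $(\mathcal{V}, \tilde{\mathcal{E}})$ (the current topology)
and a vertex subset $\mathcal{B}$ ($\mathcal{B} \subset \mathcal{V}$) that satisfies

$$\mathcal{E}_{\mathcal{T}} \cup \Big( \bigcup_{b \in \mathcal{B}} \mathcal{L}_b \Big) = \tilde{\mathcal{E}}_0. \tag{3.15}$$

Then, if we protect (i) one line flow meter for each line in $\mathcal{E}_{\mathcal{T}}$ and (ii) the
injection meters at all buses in $\mathcal{B}$, an undetectable attack does not exist for any
target topology.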

Proof: See Section 3.7. ■

The condition (3.15) means that the edges of $\mathcal{T}$ and the edges incident to vertices
in $\mathcal{B}$ can cover all the lines (both connected and disconnected) of the grid. One
can easily find such $\mathcal{T}$ and $\mathcal{B}$ using available graph algorithms.
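As an illustration of how the covering condition (3.15) might be verified for a candidate spanning tree and bus subset, the sketch below treats lines as undirected edges, checks that the tree edges form a spanning tree of the current topology, and then checks that the tree edges together with the edges incident to the chosen buses cover every line of the grid. The function names and the 4-bus example are hypothetical.

```python
def is_spanning_tree(vertices, edges):
    """True if the undirected `edges` connect all `vertices` without forming a cycle."""
    if len(edges) != len(vertices) - 1:
        return False
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for edge in edges:
        a, b = tuple(edge)
        root_a, root_b = find(a), find(b)
        if root_a == root_b:      # adding this edge would close a cycle
            return False
        parent[root_a] = root_b
    return True


def satisfies_cover_up(vertices, current_lines, all_lines, tree_lines, buses):
    """Check condition (3.15) for a candidate tree and bus subset.

    current_lines: lines connected in the current topology
    all_lines: all lines of the grid, connected or not
    tree_lines: candidate spanning-tree lines (must be currently connected)
    buses: candidate set of buses whose injection meters would be protected
    """
    def undirected(lines):
        return {frozenset(line) for line in lines}

    E, E0, ET = undirected(current_lines), undirected(all_lines), undirected(tree_lines)
    if not (ET <= E and is_spanning_tree(vertices, ET)):
        return False
    incident = {edge for edge in E0 if edge & set(buses)}
    return (ET | incident) == E0


# Hypothetical 4-bus grid in which every line is currently connected.
vertices = [1, 2, 3, 4]
all_lines = [(1, 2), (2, 3), (3, 4), (1, 4), (1, 3)]
tree = [(1, 2), (2, 3), (3, 4)]
```

In this example the two non-tree lines both touch bus 1, so choosing bus 1 alone completes the cover, whereas choosing bus 2 alone does not.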

Fig. 3.4 describes a cover-up strategy for the IEEE 14-bus system. The strategy
used the spanning tree $\mathcal{T}$ marked by red dashed lines, and $\mathcal{B} = \{1, 4, 13\}$. The
unprotected meters and protected meters are marked by black rectangles and blue
circles respectively. In this example, the strategy requires protection of 30% of
meters. In addition, numerically checking the algebraic condition showed that if
the control center removes any of the protections, the grid becomes vulnerable to
undetectable topology attacks. This suggests that the strategy does not require
protection of an excessive number of meters. For the IEEE 118-bus system, a cover-up
strategy required protection of 31% of meters.

The cover-up strategy also prevents undetectable state attacks [18]. It follows
from Theorem 1 in [24], which states that an undetectable state attack does not
exist if and only if the secure meters, protected by the control center, make the
system state observable. Because the strategy protects one line meter for each line
in the spanning tree $\mathcal{T}$, the system state is always observable with the protected
meters [26].


3.6  Numerical  Results 

We  first  present  practical  uses  of  the  algebraic  condition  for  undetectable  attacks. 
Then,  we  test  the  proposed  attacks  with  IEEE  14-bus  and  118-bus  systems,  and 
present their effect on real-time LMPs.

Figure 3.4: Rectangles (or circles) on buses and lines represent injection meters
and line flow meters respectively. We assume that $\mathcal{E} = \mathcal{E}_0$. The attacker may
attempt to remove lines from $\mathcal{G}$.

3.6.1  Application  of  Undetectability  Condition 


In Section 3.3.1, the necessary and sufficient algebraic condition is given to check
whether an adversary can launch an undetectable attack for a target $\bar{\mathcal{G}}$ with a
subspace $\mathcal{A}$ of feasible attack vectors. Here, we provide examples of how the
condition can be used by both attackers and the control center.

Suppose that an attacker with global information aims to remove a specific
set of lines from the topology. In Section 3.3.2, we have shown that the
state-preserving attack requires the smallest dimension of $\mathcal{A}$ among undetectable attacks
under mild conditions. If the conditions are met and the attacker can perform
the necessary meter modifications, the state-preserving attack can be launched
with guaranteed optimality. However, if the attacker cannot perform some
meter modification required by the state-preserving attack, it should search for an
undetectable alternative with a reasonably small dimension for $\mathcal{A}$. The algebraic
condition can be used to find such an alternative.$^7$ For instance, for a line-removal
attack on the IEEE 14-bus network in Fig. 3.4, Table 3.1 shows some alternatives
to the state-preserving attack when the attacker cannot modify some injection
meter.

Table 3.1: The adversary-controlled meters for the attacks to remove lines (2,4)
and (12,13): i -> j denotes the meter for the line flow from bus i to bus j; i
denotes the injection meter at bus i.

Attack                              Adversary-controlled meters
State-preserving attack             2 -> 4, 4 -> 2, 12 -> 13, 13 -> 12, 2, 4, 12, 13
Alternative 1 (not modifying 12)    2 -> 4, 4 -> 2, 12 -> 13, 13 -> 12, 6 -> 12, 12 -> 6, 2, 4, 6, 13
Alternative 2 (not modifying 4)     2 -> 4, 4 -> 2, 12 -> 13, 13 -> 12, 2 -> 3, 3 -> 2, 3 -> 4, 4 -> 3, 2, 3, 12, 13


When the set of adversary-controlled meters is fixed, the algebraic condition
can be exploited to find the target topologies for which the attacker can launch
undetectable attacks. For instance, in the IEEE 14-bus network in Fig. 3.4, assume
that the attacker can modify the data from the injection meters at 11, 12, and
14, and all the line flow meters on (6,12), (6,11), (10,11), (9,10), (9,14), and
(13,14). Then, numerically checking the algebraic condition shows that the attacker
cannot launch an undetectable attack for any target. However, if the attacker can
additionally control the line flow meters on (12,13), it can launch an undetectable
attack to remove any set of lines listed in Table 3.2 from the current topology.

Table 3.2: The Sets of Lines Undetectable Attacks Can Remove

$|\mathcal{E} \triangle \bar{\mathcal{E}}|$    $\mathcal{E} \triangle \bar{\mathcal{E}}$ (lines to be removed by the attack)
1    {(6,12)}, {(6,11)}, {(10,11)}, {(9,10)}, {(9,14)}, {(13,14)}, {(12,13)}
2    {(10,11), (13,14)}, {(9,14), (12,13)}, {(9,10), (13,14)}, {(6,12), (13,14)},
     {(6,12), (10,11)}, {(6,12), (9,10)}, {(6,11), (12,13)}, {(6,11), (9,14)}
3    {(6,11), (9,14), (12,13)}, {(6,12), (9,10), (13,14)}, {(6,12), (10,11), (13,14)}

$^7$ One heuristic way to find an alternative, which we employed, is to begin with a large set $\mathcal{X}$ of
adversary-controlled meters that satisfies the algebraic condition and the constraint (e.g., exclude
a certain injection meter) and remove meters from $\mathcal{X}$ one by one such that after each removal of
a meter, $\mathcal{X}$ still satisfies the algebraic condition. If no more meters can be removed, we take $\mathcal{X}$
as an alternative. The final set depends on the initial $\mathcal{X}$ and the sequence of removed elements.
One can try this procedure multiple times with different initial sets $\mathcal{X}$ and removal sequences, and
pick the one with the smallest size.

The control center can also utilize the algebraic condition to decide which meters
to put more security measures on. For instance, in the IEEE 14-bus network,
suppose that the control center protects all the injection meters. In the worst
case, the attacker may be able to modify all the line flow measurements. In this
case, checking the algebraic condition shows that the attacker can launch an
undetectable line-removal attack for any target topology, as long as the system with
the target topology is observable. However, checking the algebraic condition also
shows that if the control center additionally protects any one line flow meter, an
undetectable attack does not exist for any target. Therefore, it is worthwhile for
the control center to make an effort to secure one more line flow meter.
3.6.2 Undetectability and Effects on Real-time LMP

We tested the state-preserving attack with global information and the heuristic
with local information on the IEEE 14-bus and IEEE 118-bus systems, and investigated
their effects on real-time LMPs. The AC power flow model and nonlinear state
estimation were used to emulate the real-world power grid.

For  simulations,  we  first  assigned  the  line  capacities,  generation  limits,  and 
estimated  loads,  and  obtained  the  day-ahead  dispatch.  Then,  we  modeled  the 
voltage  magnitudes  and  phases  of  buses  as  Gaussian  random  variables  centered 
at  the  system  state  for  the  day-ahead  dispatch,  with  small  variances.  In  each 
Monte  Carlo  run,  we  generated  a  state  vector  from  the  distribution  and  used  the 
nonlinear AC power flow model (see footnote 8) with Gaussian measurement noise to generate
the  noisy  measurements.  The  attacker  observed  the  noisy  measurements,  added 
the  corresponding  attack  vector  to  them,  and  passed  the  corrupt  measurements 
to  the  control  center.  The  control  center  employed  the  nonlinear  state  estimator 
to obtain the residue and performed the J(x)-test with the residue. If the J(x)-test
failed  to  detect  the  attack,  the  real-time  LMPs  were  calculated  based  on  the  state 
estimate. 

In  simulations,  we  assumed  that  the  attacker  aims  to  remove  a  single  line  from 
the  topology.  Fig.  3.5  presents  the  detection  probability  of  the  proposed  attacks 
on the IEEE 14-bus system, for different target lines. The attacks on most target
lines  succeeded  with  low  detection  probabilities,  close  to  the  false  alarm  constraint 
0.1.  Table  3.3  shows  the  detection  probability  averaged  over  all  possible  single-line 

8 In simulations, we have reactive measurements, which were not considered in our analysis of
the state-preserving attack. We simply applied the same analysis to the reactive components of
the linearized decoupled model [51] and derived the reactive counterpart of the state-preserving
attack.
Figure 3.5: Detection probability of topology attacks (false alarm constraint = 0.1). The x-axis
is the index of the target line. The measurement noise standard deviation is 0.5 p.u., and 1000
Monte Carlo runs are used.


Table 3.3: 1000 Monte Carlo runs are used.

                               14-bus                     118-bus
false alarm const. α     α = 0.1     α = 0.01       α = 0.1     α = 0.01
state-preserving         0.061       0.009          0.075       0.005
heuristic                0.105       0.019          0.095       0.009

removal attacks. In both the IEEE 14-bus and 118-bus systems, the proposed attacks
were hardly detected. In most cases, detection probabilities were as low as the false
alarm rates. The performance of the heuristic was remarkably good, considering
that it requires observing and controlling only a few local data.

We also examined the absolute perturbation of the real-time LMPs (see [36]
for real-time LMP). The parameters in the real-time LMP calculation include the
estimated set of congested lines and the shift-factor matrix; both depend on the
topology estimate. Hence, we expect that topology attacks would disturb the
real-time LMP calculation. In our simulations, both the state-preserving attack and
the heuristic perturbed the real-time LMPs by 10% on average for the IEEE 14-bus
system and 3.3% for the IEEE 118-bus system. In the 118-bus system, attacks on
some target lines affected only the buses near the target lines, so the average
perturbation was lower than in the 14-bus case.


3.7  Proofs 

3.7.1  Proof  of  Theorem  3.3.2 

The  if  statement  can  be  proved  by  constructing  an  undetectable  attack  following 
the  arguments  used  to  prove  Theorem  3.3.1  and  Theorem  3.3.5.  Due  to  the  space 
limit,  we  only  provide  the  proof  of  the  only  if  statement. 

Let $a$ be any attack with $\mathrm{Col}(H) \not\subset \mathrm{Col}(\bar{H}, U)$, where $\mathcal{U} = \{u_1, \ldots, u_K\}$ denotes
the basis of $\mathcal{A}$ consisting of unit vectors in $\mathbb{R}^m$ and $U \in \mathbb{R}^{m \times K}$ is the matrix having
the vectors in $\mathcal{U}$ as its columns. Without loss of generality, we assume that the
columns of $\bar{H}$ and the unit vectors in $\mathcal{U}$ are linearly independent; if not, we can
just work with a smaller subset of $\mathcal{U}$ satisfying the independence condition.

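Because $\mathrm{Col}(H) \not\subset \mathrm{Col}(\bar{H}, U)$, $\mathrm{Col}(H) \cap \mathrm{Col}(\bar{H}, U)$ is a subspace of $\mathrm{Col}(H)$ with
a strictly smaller dimension. Hence, $\mathcal{S} = \{x \in \mathbb{R}^n : Hx \in \mathrm{Col}(H) \cap \mathrm{Col}(\bar{H}, U)\}$
has dimension less than $n$ and thus zero Lebesgue measure in $\mathbb{R}^n$. Let $x$ be
an arbitrary element of $\mathbb{R}^n \setminus \mathcal{S}$. Then, $y = Hx \notin \mathrm{Col}(\bar{H}, U)$. When $x$ is the true
state, $z = y + e$, and the J(x)-test statistic for $a$ is

$$J = \| W (y + e + a(y + e)) \|^2_{\Sigma^{-1}},$$

where $W = I - \bar{H}(\bar{H}^T \Sigma^{-1} \bar{H})^{-1} \bar{H}^T \Sigma^{-1}$. Since $a(z) \in \mathrm{Col}(U)$ for all $z$, $J$ is lower
bounded by

$$L = \min_{(\alpha_k)_{k=1}^K} \Big\| W \Big( y + e + \sum_{k=1}^K \alpha_k u_k \Big) \Big\|^2_{\Sigma^{-1}}.$$

The minimization in $L$ is achieved by the linear WLS solution, and one
can show that $L = (W(y + e))^T \Sigma^{-1} \tilde{W} (y + e)$, where $\tilde{W} = W -
(WU)[(WU)^T \Sigma^{-1} (WU)]^{-1} (WU)^T \Sigma^{-1} W$. $W$ and $\tilde{W}$ are idempotent and $\Sigma^{-1}\tilde{W}$
is symmetric. Using these properties, one may derive that

$$L = (\Sigma^{-1/2}(y + e))^T (\Sigma^{-1/2} \tilde{W} \Sigma^{1/2}) (\Sigma^{-1/2}(y + e)).$$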

The above quadratic form has the following properties: (i) $\Sigma^{-1/2} \tilde{W} \Sigma^{1/2}$ is idempotent
and symmetric, (ii) $\Sigma^{-1/2}(y + e) \sim \mathcal{N}(\Sigma^{-1/2} y, I_m)$, and (iii) $\mathrm{rank}(\Sigma^{-1/2} \tilde{W} \Sigma^{1/2}) =
m - n - K$. With these three properties, Theorem B.33 and Theorem 1.3.3 in [75]
imply that $L$ has the noncentral chi-squared distribution with $m - n - K$
degrees of freedom and the noncentrality parameter $\lambda = (\tilde{W}y)^T \Sigma^{-1} (\tilde{W}y)$.

It can be shown that $y \notin \mathrm{Col}(\bar{H}, U)$ implies $\tilde{W}y \neq 0$. Hence, if the diagonal
entries of $\Sigma$ (denoted by $\sigma_i^2$, $1 \le i \le m$) uniformly decrease to 0, then $\lambda =
\sum_{i=1}^m (\tilde{W}y)_i^2 / \sigma_i^2$ grows to infinity. Suppose that the J(x)-test uses a threshold $\tau$.
The detection probability of the attack is $\Pr(J > \tau)$, and it is lower bounded by
$\Pr(L > \tau)$. Moreover, $\Pr(L > \tau)$ approaches 1 as the noncentrality parameter $\lambda$ grows
to infinity. Therefore, if the diagonal entries of $\Sigma$ (i.e., the noise variances) uniformly
decrease to 0, then $\lambda$ grows to infinity and $\Pr(J > \tau)$ approaches 1. Hence, the
only if statement and the additional statement are proved. ■

3.7.2  Proof  of  Theorem  3.3.3 

Let $\mathcal{E} \triangle \bar{\mathcal{E}} = \{(a, b)\}$. We prove the statement for the case that the attack removes
$(a, b)$, and there are two line flow meters on $(a, b)$ (one for each direction) and
injection meters at both $a$ and $b$. For the line addition attack and other meter
availabilities, a similar argument can be made.

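Suppose there exists an undetectable attack with $\mathcal{A}$, and let $\mathcal{U} = \{u_1, \ldots, u_K\}$
denote the basis of $\mathcal{A}$ consisting of unit vectors in $\mathbb{R}^m$. Theorem 3.3.1 implies
$\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, \mathcal{A})$. It can be easily verified that $m_{(a,b)} \in \mathrm{Col}(\bar{H}, \mathcal{A})$, and this
implies $m_{(a,b)} = \bar{H}x + \sum_{k=1}^K \alpha_k u_k$ for some $x \in \mathbb{R}^n$ and $(\alpha_k)_{k=1}^K \in \mathbb{R}^K$. Then,
$\tilde{m} = m_{(a,b)} - \sum_{k=1}^K \alpha_k u_k \in \mathrm{Col}(\bar{H})$.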

Let $\tilde{m}^{ij}$ ($\tilde{m}^{i}$) denote the row entry of $\tilde{m}$ corresponding to the line flow from $i$ to
$j$ (the injection at $i$) and $u_{(i,j)}$ ($u_{(i)}$) denote the $m$-dimensional unit vector with 1
at the row corresponding to the line flow from $i$ to $j$ (the injection at $i$). Physically,
$\tilde{m} \in \mathrm{Col}(\bar{H})$ means that $\tilde{m}$ is a vector of meter data consistent with the topology
$\bar{\mathcal{G}}$. It implies that (i) $\tilde{m}^{ab}$ and $\tilde{m}^{ba}$ are zeros, since $(a, b)$ is disconnected in $\bar{\mathcal{G}}$, and
(ii) Kirchhoff's current law (KCL) should hold at buses $a$ and $b$ in $\bar{\mathcal{G}}$, i.e., the
sum of all outgoing line flows from $a$ should be equal to the injection amount at
$a$. Using the special structure of $m_{(a,b)}$ and $\tilde{m}$, the following can be proved. From
(i), one can prove that $u_{(a,b)}, u_{(b,a)} \in \mathcal{U}$. From (ii), one can show that $\mathcal{U}$ should
include $u_{(a)}$ or some $u_{(a,k)}$ (or $u_{(k,a)}$) with $a$ and $k$ connected in $\bar{\mathcal{G}}$. Similarly, $\mathcal{U}$
should include $u_{(b)}$ or some $u_{(b,l)}$ (or $u_{(l,b)}$) with $b$ and $l$ connected in $\bar{\mathcal{G}}$. Hence, $|\mathcal{U}|$
is no less than the total number of meters located on the target line $(a, b)$ and the
target buses $a$ and $b$. ■

3.7.3  Proof  of  Theorem  3.3.4 

Suppose $a$ is an undetectable attack with $\mathcal{A}$ for the target topology $\bar{\mathcal{G}}$ satisfying
the theorem conditions. Let $\mathcal{U} = \{u_1, \ldots, u_K\}$ be the basis of $\mathcal{A}$ consisting of unit
vectors in $\mathbb{R}^m$, and $\mathcal{J} \subset \mathcal{V}$ denote the set of target buses with injection meters.
For ease of presentation, we assume that each target line $(i, j)$ has two line flow
meters, one for each direction. For other meter availabilities, a similar argument
can be made.

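Theorem 3.3.1 implies that $\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, U)$. It can be easily shown that if
the target lines do not form a closed path in $\mathcal{G}$, then $\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, U)$ implies
that $m_{(i,j)} \in \mathrm{Col}(\bar{H}, U)$ for all target lines $(i, j) \in \mathcal{E} \setminus \bar{\mathcal{E}}$.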

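$m_{(i,j)} \in \mathrm{Col}(\bar{H}, U)$ means that it is possible to find a linear combination of
vectors in $\mathcal{U}$, $\sum_{k=1}^K \alpha_k u_k$, such that $\tilde{m}_{(i,j)} = m_{(i,j)} + \sum_{k=1}^K \alpha_k u_k \in \mathrm{Col}(\bar{H})$. $\tilde{m}_{(i,j)} \in
\mathrm{Col}(\bar{H})$ implies that (i) the row entries of $\tilde{m}_{(i,j)}$ corresponding to the line flows of
the disconnected lines in $\bar{\mathcal{G}}$ are zeros, and (ii) the entries of $\tilde{m}_{(i,j)}$ satisfy KCLs at
all buses in $\bar{\mathcal{G}}$.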

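For each $(i, j) \in \mathcal{E} \setminus \bar{\mathcal{E}}$, since $(i, j)$ is disconnected in $\bar{\mathcal{G}}$, $\tilde{m}^{ij}_{(i,j)} = \tilde{m}^{ji}_{(i,j)} = 0$. On
the other hand, $m^{ij}_{(i,j)} = 1$ and $m^{ji}_{(i,j)} = -1$. Hence, $\mathcal{U}$ should include $u_{(i,j)}$ and
$u_{(j,i)}$. Therefore, $\mathcal{U}$ should contain $\{u_{(i,j)}, u_{(j,i)} : (i, j) \in \mathcal{E} \setminus \bar{\mathcal{E}}\}$.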

For each $i \in \mathcal{J}$, the assumptions imply that each line adjacent to $i$ in $\mathcal{G}$ has at
least one line flow meter. We let $\mathcal{N}_i$ denote the set of the line flow meters on the
lines incident to $i$ in $\bar{\mathcal{G}}$, and $m^{\mathcal{N}_i}_{(i,j)}$ denote the vector of the corresponding entries in
$m_{(i,j)}$. Because $m_{(i,j)}$ has nonzero entries only for the injections at $i$ and $j$ and the
line flows through $(i, j)$, $m^{\mathcal{N}_i}_{(i,j)}$ has all zero entries. On the other hand, $m^{i}_{(i,j)} = 1$.
Hence, for $\tilde{m}_{(i,j)}$ to satisfy the KCL at bus $i$ in $\bar{\mathcal{G}}$, at least one of $m^{i}_{(i,j)}$ or the entries
of $m^{\mathcal{N}_i}_{(i,j)}$ has to be modified by $\sum_{k=1}^K \alpha_k u_k$. Thus, $\mathcal{U}$ should contain $u_{(i)}$ or $u_{(a,b)}$
for some $(a, b) \in \mathcal{N}_i$.

In case that $u_{(i)} \notin \mathcal{U}$, for $\tilde{m}_{(i,j)}$ to satisfy the KCL at bus $i$ in $\bar{\mathcal{G}}$, at least one
entry of $\tilde{m}^{\mathcal{N}_i}_{(i,j)}$ should have a nonzero value: suppose $\tilde{m}^{ik}_{(i,j)}$ takes a nonzero value.
If $k \in \mathcal{J}$, we can make a similar argument based on the KCL at $k$: $\mathcal{U}$ should contain
$u_{(k)}$ or $u_{(a,b)}$ for some $(a, b) \in \mathcal{N}_k \setminus \{(i, k), (k, i)\}$. Following this line of argument,
we can derive that for each $i \in \mathcal{J}$, $\mathcal{U}$ should contain unit vectors corresponding
to at least one of the following sets: (i) the injection meter at $i$, (ii) line flow meters
on all the lines in some path $(i, v_2, \ldots, v_n)$ in $\bar{\mathcal{G}}$ and the injection meter at $v_n$, where
$v_2, \ldots, v_n \in \mathcal{J}$, or (iii) line flow meters on all the lines in some path $(i, v_2, \ldots, v_n)$
in $\bar{\mathcal{G}}$, where $v_2, \ldots, v_{n-1} \in \mathcal{J}$ and $v_n$ is either equal to one of $\{v_2, \ldots, v_{n-1}\}$ or not
in $\mathcal{J}$. For each $i \in \mathcal{J}$, $\mathcal{U}$ should contain at least one set of unit vectors corresponding
to any of the above three cases: we let $\mathcal{S}_i$ denote an arbitrary one of such sets.


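Note that $\{u_{(i,j)}, u_{(j,i)} : (i, j) \in \mathcal{E} \setminus \bar{\mathcal{E}}\}$ does not overlap with $\bigcup_{i \in \mathcal{J}} \mathcal{S}_i$. Hence,
$|\mathcal{U}| \ge |\bigcup_{i \in \mathcal{J}} \mathcal{S}_i| + |\{u_{(i,j)}, u_{(j,i)} : (i, j) \in \mathcal{E} \setminus \bar{\mathcal{E}}\}|$. Proving $|\bigcup_{i \in \mathcal{J}} \mathcal{S}_i| \ge |\mathcal{J}|$ gives us
the theorem statement, because $|\mathcal{J}| + |\{u_{(i,j)}, u_{(j,i)} : (i, j) \in \mathcal{E} \setminus \bar{\mathcal{E}}\}|$ is the exact
number of meters the state-preserving attack modifies.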


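We will prove the following statement for all $n \le |\mathcal{J}|$, by mathematical induction:
for any subset $\tilde{\mathcal{J}} \subset \mathcal{J}$ with $|\tilde{\mathcal{J}}| = n$, $|\bigcup_{i \in \tilde{\mathcal{J}}} \mathcal{S}_i| \ge n$. For $n = 1, 2, 3$, the
statement can be easily verified. Suppose the statement is true for all $n \le k$
($k \ge 3$), and $\tilde{\mathcal{J}}$ is an arbitrary subset of $\mathcal{J}$ with $|\tilde{\mathcal{J}}| = k + 1$. The tree
condition guarantees that $\tilde{\mathcal{J}}$ can be partitioned into two nonempty sets $\tilde{\mathcal{J}}_1$ and
$\tilde{\mathcal{J}}_2$ such that for any $b_1 \in \tilde{\mathcal{J}}_1$ and $b_2 \in \tilde{\mathcal{J}}_2$, every path in $\bar{\mathcal{G}}$ between $b_1$ and $b_2$
contains a node not in $\mathcal{J}$. This implies that $\bigcup_{b \in \tilde{\mathcal{J}}_1} \mathcal{S}_b$ and $\bigcup_{b \in \tilde{\mathcal{J}}_2} \mathcal{S}_b$ are disjoint. By
the induction hypothesis, we have $|\bigcup_{b \in \tilde{\mathcal{J}}_1} \mathcal{S}_b| \ge |\tilde{\mathcal{J}}_1|$ and $|\bigcup_{b \in \tilde{\mathcal{J}}_2} \mathcal{S}_b| \ge |\tilde{\mathcal{J}}_2|$. Thus,
$|\bigcup_{b \in \tilde{\mathcal{J}}} \mathcal{S}_b| = |\bigcup_{b \in \tilde{\mathcal{J}}_1} \mathcal{S}_b| + |\bigcup_{b \in \tilde{\mathcal{J}}_2} \mathcal{S}_b| \ge |\tilde{\mathcal{J}}_1| + |\tilde{\mathcal{J}}_2| = |\tilde{\mathcal{J}}|$. Therefore, the induction
implies $|\bigcup_{i \in \mathcal{J}} \mathcal{S}_i| \ge |\mathcal{J}|$, and the theorem statement follows. ■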
3.7.4 Proof of Theorem 3.5.1

Suppose meters are protected as described with $\mathcal{T}$ and $\mathcal{B}$. Let $\mathcal{A}$ be the resulting
subspace of feasible attack vectors and $\mathcal{U} = \{u_1, \ldots, u_K\}$ denote the basis of $\mathcal{A}$
consisting of unit vectors in $\mathbb{R}^m$. Assume that an undetectable attack can be
launched for some target topology $\bar{\mathcal{G}}$ (different from $\mathcal{G}$). We will show that this
assumption leads to a contradiction.

Note that $\mathcal{U}$ cannot contain the unit vectors corresponding to the protected
measurements. In addition, Theorem 3.3.2 implies that $\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, U)$.
These two imply that the lines in $\mathcal{E}_{\mathcal{T}}$ cannot be removed by the attack, because
each line has a protected line flow meter.

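Let $\tilde{H}$ ($\tilde{\bar{H}}$) denote the submatrix of $H$ ($\bar{H}$) obtained by selecting the rows
corresponding to the protected meter measurements. One can easily verify that
$\mathrm{Col}(H) \subset \mathrm{Col}(\bar{H}, U)$ if and only if $\mathrm{Col}(\tilde{H}) \subset \mathrm{Col}(\tilde{\bar{H}})$. Hence, we have $\mathrm{Col}(\tilde{H}) \subset
\mathrm{Col}(\tilde{\bar{H}})$. This means that for all $x \in \mathbb{R}^n$, there exists $y \in \mathbb{R}^n$ such that $\tilde{\bar{H}}y = \tilde{H}x$.
Let $H_{\mathcal{T}}$ denote the submatrix of $H$ obtained by selecting the rows corresponding
to the protected line flow meters on the spanning tree $\mathcal{T}$. Since the lines in $\mathcal{E}_{\mathcal{T}}$
cannot be removed by the attack, the $H_{\mathcal{T}}$ part of $H$ remains the same in $\bar{H}$; hence,
$H_{\mathcal{T}}$ is also a submatrix of $\bar{H}$. Thus, $\tilde{\bar{H}}y = \tilde{H}x$ implies $H_{\mathcal{T}}y = H_{\mathcal{T}}x$. Since $\mathcal{T}$ is
a spanning tree and it has one protected line flow meter per line, the protected
line meters on $\mathcal{T}$ make the grid observable [26]. Hence, $H_{\mathcal{T}}$ has full column rank.
Consequently, $H_{\mathcal{T}}y = H_{\mathcal{T}}x$ implies $y = x$, and we have $\tilde{\bar{H}}x = \tilde{H}x$. This holds for
all $x \in \mathbb{R}^n$.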

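Let $a$ be any element in $\mathcal{B}$. We will show that any line in $\mathcal{E}_a$ cannot be a
target line. Note that the injection meter at $a$ is protected, so $\tilde{H}$ and $\tilde{\bar{H}}$ have
the row corresponding to the injection at $a$. $\tilde{\bar{H}}x = \tilde{H}x$ for all $x \in \mathbb{R}^n$ implies
that the injection at bus $a$ should be the same for $\mathcal{G}$ and $\bar{\mathcal{G}}$ as long as the state
is the same for the two cases. When the state is $x$, the injection at $a$ in $\mathcal{G}$ is
$\sum_{k:\{a,k\} \in \mathcal{E}} B_{ak}(x_a - x_k)$, and the injection at $a$ in $\bar{\mathcal{G}}$ is $\sum_{l:\{a,l\} \in \bar{\mathcal{E}}} B_{al}(x_a - x_l)$. Thus
we have

$$\sum_{k:\{a,k\} \in \mathcal{E}} B_{ak}(x_a - x_k) = \sum_{l:\{a,l\} \in \bar{\mathcal{E}}} B_{al}(x_a - x_l), \quad \forall x \in \mathbb{R}^n,$$

which can be rewritten as follows: for all $x \in \mathbb{R}^n$,

$$\sum_{k:\{a,k\} \in \mathcal{E} \setminus \bar{\mathcal{E}}} B_{ak}(x_a - x_k) - \sum_{l:\{a,l\} \in \bar{\mathcal{E}} \setminus \mathcal{E}} B_{al}(x_a - x_l) = 0.$$

If $\mathcal{E}_a \cap (\mathcal{E} \triangle \bar{\mathcal{E}})$ is not empty, the above statement is true only when $B_{ak} = 0$
for all $\{a, k\} \in \mathcal{E}_a \cap (\mathcal{E} \triangle \bar{\mathcal{E}})$. $B_{ak}$ is the susceptance of the line $\{a, k\}$ when it is
"connected", and this value is nonzero in practice for every line. Hence, $\mathcal{E}_a \cap (\mathcal{E} \triangle \bar{\mathcal{E}})$
should be empty; i.e., a line in $\mathcal{E}_a$ cannot be a target line.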

It was shown that the lines in $\mathcal{T}$ and $\bigcup_{a \in \mathcal{B}} \mathcal{E}_a$ cannot be target lines. Thus, the
condition (3.15) implies that no line can be a target line, and this contradicts the
assumption that there exists an undetectable topology attack. ■
CHAPTER 4

DATA  FRAMING  ATTACK  ON  STATE  ESTIMATION 

4.1  Introduction 

A promising feature of a future smart grid is the data-driven approach to automated
monitoring, control, and decision. The paradigm shift to a data-driven
framework enables deeper integration of data collection and sophisticated data
processing. While extracting actionable information from real-time sensing data
can make the grid more efficient and adaptive to real-time operating conditions, it
exposes the grid to possible cyber data attacks aimed at disrupting grid operations
and potentially causing blackouts.

In  [18],  Liu,  Ning,  and  Reiter  presented  perhaps  the  first  man-in-the-middle 
(MiM)  attack  on  the  power  system  state  estimation  where  an  adversary  replaces 
“normal”  sensor  data  with  “malicious  data.”  It  was  shown  that,  if  the  adversary 
could  gain  control  of  a  sufficient  number  of  meters,  it  could  perturb  the  state 
estimate  by  an  arbitrary  amount  without  being  detected  by  the  bad  data  detector 
employed  at  the  control  center.  Such  undetectable  attacks  are  referred  to  as  covert 
data  attacks. 

The condition under which covert data attacks are possible is found to be
equivalent to that of system observability. In particular, covert attacks are possible
if and only if the system becomes unobservable when the meters under attack
are removed [24] (or equivalently, the adversary controls a critical set of meters).
Therefore, the minimum number of meters that an adversary has to control in
order to launch a covert data attack, referred to as a security index, is an important
measure of security against data attack. It represents a fundamental limit on the
capability of an adversary to covertly disrupt the operation of the grid [20,24].

In  this  chapter,  we  show  that  the  barrier  on  the  capability  of  an  adversary 
can  be  circumvented  by  using  a  different  form  of  attacks,  one  that  exploits  the 
vulnerabilities  of  the  existing  bad  data  detection  and  removal  mechanisms.  In 
particular,  we  show  that  the  adversary  only  needs  to  gain  control  of  about  half  of 
the  meters  required  by  the  security  index  while  achieving  the  same  objective  of 
perturbing  the  state  estimate  by  an  arbitrary  amount  without  being  detected  by 
the  control  center. 

The attacks considered in this chapter are referred to as data framing attacks,
borrowing the notion of framing as providing false evidence to make someone
innocent appear to be guilty of misconduct. In the context of state estimation,
a data framing attack means that an adversary launches a data attack in such a
way that the control center detects the presence of bad data and identifies normal
meters as sources of bad data. To this end, the attacker does not try to make
malicious data pass the bad data detection (as a covert attack tries to do). Instead,
it purposely triggers the bad data detection and causes erroneous removal of good
data. Unknown to the control center, the remaining data still contain
adversary-injected malicious data, causing errors in the state estimate.

4.1.1  Summary  of  Results  and  Organization 

We propose a data framing attack on power system state estimation. Specifically,
we formulate the design of the optimal data framing attack as a quadratically
constrained quadratic program (QCQP). To analyze the efficacy of the data framing
attack, we present a sufficient condition under which the framing attack can
achieve an arbitrary perturbation of the state estimate by controlling only half
of the critical set of meters. We demonstrate with the IEEE 14-bus and 118-bus
networks that the sufficient condition holds in critical sets associated with cuts.

The optimal design of the framing attack is based on a linearized system. In practice,
a nonlinear state estimator is often used. We demonstrate that, under the
nonlinear measurement model, the framing attacks designed based on the linearized
system model successfully perturb the state estimate, and the adversary can control
the degree of perturbation as desired.

The  rest  of  the  chapter  is  organized  as  follows.  Section  4.2  introduces  the 
measurement  and  adversary  models  with  preliminaries  on  state  attacks.  Section  4.3 
presents  the  mathematical  model  of  state  estimation  and  bad  data  processing.  In 
Section  4.4,  we  present  the  main  idea  of  the  data  framing  attack  and  the  QCQP 
framework  for  the  attack  design.  Section  4.5  provides  a  theoretical  justification  of 
the  efficacy  of  the  data  framing  attack.  In  Section  4.6,  we  test  the  data  framing 
attack  with  the  IEEE  14-bus  and  118-bus  networks. 


4.2  Mathematical  Models 

This section introduces the topology and system state of a power network, the
meter measurement model, and the adversary model. In addition, the covert state
attack and its connection with network observability are explained. Throughout
the chapter, boldface lower case letters (e.g., x) denote vectors, $x_i$ denotes the $i$th
entry of the vector x, boldface upper case letters (e.g., H) denote matrices, $H_{ij}$
denotes the $(i, j)$ entry of H, $\mathcal{R}(H)$ denotes the column space of H, N(H) denotes
the null space of H, and script letters (e.g., $\mathcal{J}$, $\mathcal{A}$) denote sets. The multivariate
normal distribution with mean $\mu$ and covariance matrix $\Sigma$ is denoted by
$\mathcal{N}(\mu, \Sigma)$.


4.2.1  Network  and  Measurement  Models 

A power network is a network of buses connected by transmission lines, and thus
the topology of the grid can be naturally defined as an undirected graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$,
where $\mathcal{V}$ is the set of buses and $\mathcal{E}$ is the set of lines connecting buses ($\{i, j\} \in \mathcal{E}$ if
and only if bus $i$ and bus $j$ are connected). The system state of the power network
is defined as the vector of bus voltage magnitudes and phase angles, from which
all the other quantities (e.g., power line flows, power injections, line currents) can
be calculated.

For real-time estimation of the system state, the control center collects measurements
from line flow and bus injection meters (see footnote 1) deployed throughout the grid.
The meter measurements are related to the system state x in a nonlinear fashion,
and the relation is described by the AC model [51]:

z = h(x) + e,  (4.1)

where h(·) is the nonlinear measurement function, and e is the Gaussian measurement
noise with a diagonal covariance matrix.

If  some  of  the  meters  malfunction  or  an  adversary  injects  malicious  data,  the 
control  center  observes  biased  measurements, 

z  =  h(x)  +  e  +  a,  (4.2) 

1  Other  types  of  meters  can  also  be  considered,  but  we  restrict  our  attention  to  line  flow  and 
bus  injection  meters  for  simplicity. 




where  a  represents  a  deterministic  bias.  In  such  a  case,  the  data  are  said  to  be 
bad, and the biased meter entries are referred to as bad data entries. Note that
even  when  a  meter  is  protected  from  adversarial  modification,  it  may  still  have  a 
bias  due  to  a  physical  malfunction  or  improper  parameter  setting;  filtering  out  the 
measurements  from  such  malfunctioning  meters  was  the  original  objective  of  the 
legacy  bad  data  processing  and  is  adopted  in  practice  today  [40]. 

Even  though  the  model  in  (4.1)  is  nonlinear,  the  state  estimate  is  generally 
obtained  by  iterations  of  weighted  linear  least  squares  estimation  with  the  locally 
linearized  model  [51].  Therefore,  it  is  reasonable  to  analyze  the  performance  of 
state  estimation  using  the  locally  linearized  model  around  the  system  operating 
point.  To  this  end,  in  analyzing  the  attack  effect  on  state  estimation,  we  adopt  the 
so-called  DC  model  [51].  In  the  DC  model,  for  the  ease  of  analysis,  the  AC  model 
(4.1) is linearized around the system state where all voltage phasors are equal to
$1\angle 0$, and only the real parts of the measurements are retained:

z  =  Hx  +  e,  (4.3) 

where $z \in \mathbb{R}^m$ is the measurement vector consisting of the real parts of line flow and bus
injection measurements, the system state $x \in \mathbb{R}^n$ is the vector of voltage phase
angles at all buses except the reference bus ($x$ is unknown, but deterministic),
$H \in \mathbb{R}^{m \times n}$ is the DC measurement matrix that relates the system state to bus
injection and line flow amounts, and $e$ is the Gaussian measurement noise with
a diagonal covariance matrix $\Sigma$. We represent the noise covariance matrix $\Sigma$
as $\Sigma = \sigma^2 \bar{\Sigma}$, where $\bar{\Sigma}$ is a diagonal matrix representing the variation of noise
variances across different meters ($\sum_{i=1}^m \bar{\Sigma}_{ii} = 1$), and $\sigma^2$ is a scaling factor.

Each row of H has a special structure depending on the type of the meter [51].
For ease of presentation, consider the noiseless measurement z = Hx. If an entry
$z_k$ of z is the measurement of the line flow from bus $i$ to bus $j$, $z_k$ is $B_{ij}(x_i - x_j)$,
where $B_{ij}$ is the line susceptance and $x_i$ is the voltage phase angle at bus $i$ [51].
If $z_k$ is the measurement of bus injection at $i$, it is the sum of all the outgoing
line flows from $i$, and the corresponding row of H is the sum of the row vectors
corresponding to all the outgoing line flows.
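
The row structure described above can be illustrated with a minimal sketch that assembles H for a small network. The function name `build_dc_matrix`, the 3-bus network, and its susceptance values are hypothetical and chosen only for illustration; bus 0 is assumed to be the reference bus.

```python
def build_dc_matrix(num_buses, susceptance, flow_meters, injection_meters, ref_bus=0):
    """Return H as a list of rows: line flow meters first, then injection meters.

    susceptance maps frozenset({i, j}) to B_ij; columns are all buses except ref_bus.
    """
    cols = [b for b in range(num_buses) if b != ref_bus]
    col_of = {bus: c for c, bus in enumerate(cols)}

    def flow_row(i, j):
        # Row for the flow from i to j: B_ij * (x_i - x_j); the reference angle is 0.
        row = [0.0] * len(cols)
        b = susceptance[frozenset((i, j))]
        if i in col_of:
            row[col_of[i]] += b
        if j in col_of:
            row[col_of[j]] -= b
        return row

    rows = [flow_row(i, j) for (i, j) in flow_meters]
    for i in injection_meters:
        # Injection at i: sum of the rows of all outgoing line flows from i.
        row = [0.0] * len(cols)
        for line in susceptance:
            if i in line:
                (j,) = line - {i}
                row = [r + f for r, f in zip(row, flow_row(i, j))]
        rows.append(row)
    return rows

# Hypothetical 3-bus network: lines {0,1}, {1,2}, {0,2}.
B = {frozenset((0, 1)): 1.0, frozenset((1, 2)): 2.0, frozenset((0, 2)): 4.0}
H = build_dc_matrix(3, B, flow_meters=[(0, 1), (1, 2)], injection_meters=[1])
```

Here the columns of `H` correspond to the phase angles at buses 1 and 2, and the last row (the injection at bus 1) is the sum of the rows for the flows from bus 1 to bus 0 and from bus 1 to bus 2.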

The  analysis  based  on  the  DC  model  needs  to  be  verified  using  the  realistic 
AC  model  simulations;  we  demonstrate  in  Section  4.6  that  the  proposed  attack 
strategy  is  also  effective  in  the  AC  model  simulations. 

4.2.2  Adversary  Model 

We consider a man-in-the-middle attack on power system state estimation. As
described in Fig. 4.1, an adversary is assumed to be capable of modifying the data
from a subset of analog meters $\mathcal{J}_A$. We refer to the meters in $\mathcal{J}_A$ as adversary
meters.

The control center observes the corrupted measurements $\bar{z}$ instead of the original
measurements $z$ in (4.1). We assume that the adversary knows the line parameters
(i.e., the measurement function h and the measurement matrix H).

The adversarial modification is mathematically modeled as follows:

$\bar{z} = z + a, \quad a \in \mathcal{A},$  (4.4)

where a is an attack vector, and $\mathcal{A}$ is the set of feasible attack vectors defined as
$\mathcal{A} = \{c \in \mathbb{R}^m : c_i = 0, \ \forall i \notin \mathcal{J}_A\}$. Note that $\mathcal{A}$ fully characterizes the ability of the
adversary. In addition, the adversary is assumed to design a without observing
any entry of z, i.e., the attack does not require any real-time observation.
4.2.3 Network Observability and Covert State Attack

For state estimation to be feasible, the control center needs to have enough meter
measurements, from which the system state can be uniquely determined. Formally,
a power network is said to be locally observable at a state $x_0$ if the system state
can be uniquely determined from the noiseless meter measurements h(x) in a
neighborhood of $x_0$. This implies that the Jacobian of h at $x_0$ has full rank.
However, due to the intractability of checking local observability for all feasible
operating points, the DC model (4.3) is generally adopted for observability analysis
[26]: the network is said to be observable if the DC measurement matrix H has
full rank. In practice, power networks should be designed to satisfy observability.
Hence, we assume that the network of our interest is observable (i.e., H has full
rank).

The concept of network observability is closely related to the feasibility of a
covert state attack. The covert state attack was proposed in [18] under the DC
model: if there exists $y \in \mathbb{R}^n \setminus \{0\}$ such that $Hy \in \mathcal{A}$, then setting a equal to Hy
results in

$\bar{z} = Hx + e + a = H(x + y) + e,$  (4.5)

and thus, $\bar{z}$ cannot be distinguished from a normal noisy measurement vector with
the state x + y. Furthermore, by properly scaling the attack vector (e.g., $\alpha a$), the
adversary can perturb the state estimate by an arbitrary degree (e.g., $\alpha y$).

Figure 4.1: Adversary model with state estimation and bad data test

It is shown in [24] that a covert attack is feasible if and only if the adversary
can control a critical set of meters, which is defined as a set of meters such that
removing the set from the network renders the network unobservable while removing
any proper subset of it does not [51]. Hence, the feasibility condition means that
removing the adversary meters renders the measurement matrix rank deficient.
The intuition behind the condition is that, for any $y \in \mathbb{R}^n \setminus \{0\}$, Hy is in $\mathcal{A}$ if and
only if Hy has zero entries for all non-adversary meters; the latter implies that the
measurement matrix after the removal of the rows corresponding to the adversary
meters is rank deficient, because y is in its null space.
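
As a small numerical illustration of this condition, the sketch below checks whether a = Hy is a feasible attack vector for a given set of adversary meters and confirms relation (4.5) in the noiseless case. The matrix `H`, the vectors `x` and `y`, and the helper names `matvec` and `covert_attack` are hypothetical values and names chosen for illustration only.

```python
def matvec(H, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(h * x for h, x in zip(row, v)) for row in H]

def covert_attack(H, y, adversary_meters, tol=1e-12):
    """Return a = Hy if it is zero on all non-adversary meters, else None."""
    a = matvec(H, y)
    feasible = all(abs(ai) <= tol
                   for i, ai in enumerate(a) if i not in adversary_meters)
    return a if feasible else None

# Hypothetical DC measurement matrix with three meters and two state variables.
H = [[-1.0, 0.0],
     [2.0, -2.0],
     [3.0, -2.0]]
y = [0.0, 1.0]                    # desired perturbation of the state

a = covert_attack(H, y, adversary_meters={1, 2})      # feasible: Hy = [0, -2, -2]
blocked = covert_attack(H, y, adversary_meters={1})   # meter 2 is also needed

# Noiseless check of (4.5): z + a equals H(x + y).
x = [0.1, 0.2]
z_bar = [zi + ai for zi, ai in zip(matvec(H, x), a)]
shifted = matvec(H, [xi + yi for xi, yi in zip(x, y)])

# Removing the adversary rows leaves a matrix whose null space contains y.
remaining = [row for i, row in enumerate(H) if i not in {1, 2}]
```

In this toy example, the single remaining row annihilates y, so the reduced measurement matrix is rank deficient, in agreement with the intuition above.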


4.3  State  Estimation  and  Bad  Data  Processing 

This section introduces a popular approach of state estimation and bad data
processing, which we assume to be employed by the control center. Once the control
center receives the measurements z, it aims to obtain the estimate $\hat{x}$ of the system
state x. Because bad data entries in z may result in a bias in the state estimate,
the control center iteratively conducts state estimation and bad data detection and
identification to filter out possible bad data entries in z.

Fig. 4.1 illustrates an iterative scheme for obtaining $\hat{x}$, which consists of three
function blocks: State Estimation, Bad Data Detection, and Bad Data Identification
[40,51]. The iteration begins with the initial measurement vector $z^{(1)} = z$
and the initial measurement function $h^{(1)} = h$, where the superscript denotes the
index for the current iteration. In each iteration, (i) the state estimate is obtained
(State Estimation), (ii) presence of bad data is tested (Bad Data Detection), and
(iii) if data are declared to be bad, one data entry is identified as bad and removed
from the measurement vector (Bad Data Identification). Table 4.1 provides the
pseudocode for the overall procedure. In the following subsections, the detailed
operation of each function block will be presented.

Table 4.1: State estimation and bad data processing

Bad-Data-Processing(z, h, \Sigma)
    z^{(1)} ← z; h^{(1)} ← h; k ← 1;
    while (true)
        (\hat{x}^{(k)}, r^{(k)}) ← State-Estimation(z^{(k)}, h^{(k)});
        result ← Bad-Data-Detection(r^{(k)});
        if result == good
            break;
        else
            (z^{(k+1)}, h^{(k+1)}) ← Bad-Data-ID(r^{(k)}, z^{(k)}, h^{(k)});
        end
        k ← k + 1;
    end
    return \hat{x}^{(k)};


4.3.1  State  Estimation  and  Bad  Data  Detection 

In the kth iteration, State Estimation uses $(z^{(k)}, h^{(k)})$ as an input, and obtains the
weighted least squares (WLS) estimate of the system state:

$$\hat{x}^{(k)} \triangleq \arg\min_{x} \, (z^{(k)} - h^{(k)}(x))^T (\Sigma^{(k)})^{-1} (z^{(k)} - h^{(k)}(x)), \tag{4.6}$$

where $\Sigma^{(k)}$ is the covariance matrix of the corresponding noise vector. Based on
the state estimate, the residue vector is also evaluated:

$$r^{(k)} \triangleq z^{(k)} - h^{(k)}(\hat{x}^{(k)}). \tag{4.7}$$

We assume that the $J(\hat{x})$-test [40,51] is employed for bad data detection: Bad
Data Detection makes a decision based on the sum of weighted squared residues:

$$\begin{cases} \text{bad data} & \text{if } (r^{(k)})^T (\Sigma^{(k)})^{-1} r^{(k)} > \tau^{(k)}, \\ \text{good data} & \text{if } (r^{(k)})^T (\Sigma^{(k)})^{-1} r^{(k)} \le \tau^{(k)}. \end{cases} \tag{4.8}$$

The $J(\hat{x})$-test is widely used due to its low complexity and the fact that the test
statistic has a $\chi^2$ distribution if the data are good [40]. The latter fact is used to
set the threshold $\tau^{(k)}$ for a given false alarm constraint.


4.3.2  Iterative  Bad  Data  Identification  and  Removal 

If  Bad  Data  Detection  (4.8)  declares  that  the  data  are  good,  the  algorithm  returns 
the state estimate $\hat{x}^{(k)}$ and terminates. However, if Bad Data Detection declares
that  the  data  are  bad,  Bad  Data  Identification  is  invoked  to  identify  and  remove 
one  bad  data  entry  from  the  measurement  vector. 

A widely used criterion for identifying a bad data entry is the normalized
residue [40,51], which is considered one of the most reliable criteria [41]. In the
normalized residue analysis, each $r_i^{(k)}$ is divided by its standard deviation under the
good data hypothesis (i.e., the standard deviation of $r_i^{(k)}$ when there exists no bad
data entry in $z^{(k)}$). If there exists no bad data entry in $z^{(k)}$, and the state estimate
$\hat{x}^{(k)}$ is close to the actual state x, the distribution of $r^{(k)}$ can be approximated by
$\mathcal{N}(0, W^{(k)} \Sigma^{(k)})$, where

$$W^{(k)} = I - H^{(k)} \big((H^{(k)})^T (\Sigma^{(k)})^{-1} H^{(k)}\big)^{-1} (H^{(k)})^T (\Sigma^{(k)})^{-1} \tag{4.9}$$

with $H^{(k)}$ denoting the Jacobian of $h^{(k)}$ at $\hat{x}^{(k)}$ and I denoting the identity matrix
with the appropriate size (see the Appendix of [40] for details). Hence, the
normalized residue is calculated as

$$\tilde{r}^{(k)} = \Omega^{(k)} r^{(k)}, \tag{4.10}$$

where $\Omega^{(k)}$ is a diagonal matrix with

$$\Omega^{(k)}_{ii} = \begin{cases} 0 & \text{if } \{i\} \text{ is a critical set}^2, \\ 1 / \sqrt{(W^{(k)} \Sigma^{(k)})_{ii}} & \text{otherwise.} \end{cases} \tag{4.11}$$



Once the normalized residue $\tilde{r}^{(k)}$ is calculated, the meter with the largest $|\tilde{r}_i^{(k)}|$
is identified as a bad meter. Bad Data Identification removes the row of $z^{(k)}$
and the row of $h^{(k)}$ that correspond to the bad meter and returns the updated
measurement vector and measurement function for the next iteration, denoted by
$z^{(k+1)}$ and $h^{(k+1)}$.

Under the DC model (4.3), State Estimation, Bad Data Detection, and Bad
Data Identification are the same as in the AC model, except that the
nonlinear measurement function $h^{(k)}(x)$ is replaced with the linear function $H^{(k)} x$
(so the Jacobian is the same everywhere). Note that the WLS state estimate (4.6)
is replaced with a simple linear WLS solution:

$$\hat{x}^{(k)} = \big((H^{(k)})^T (\Sigma^{(k)})^{-1} H^{(k)}\big)^{-1} (H^{(k)})^T (\Sigma^{(k)})^{-1} z^{(k)}, \tag{4.12}$$

and thus

$$r^{(k)} = z^{(k)} - H^{(k)} \hat{x}^{(k)} = W^{(k)} z^{(k)}. \tag{4.13}$$

2 If {i} is a critical set (i.e., removing the meter i makes the grid unobservable), its residue is
always equal to zero [51], and the corresponding diagonal entry of $W^{(k)} \Sigma^{(k)}$ is zero. For such a
meter, the normalizing factor is 0 such that its normalized residue is equal to 0.
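
The iterative procedure of this section can be sketched for the DC model as follows. This is a minimal illustration under stated assumptions, not the implementation used in the simulations: the noise covariance is taken to be diagonal, a single fixed threshold `tau` is used in every iteration (whereas the threshold in (4.8) may depend on the iteration), and the function name, the toy matrix, and the injected error are all hypothetical.

```python
import numpy as np

def bad_data_processing_dc(z, H, sigma2, tau, tol=1e-9):
    """Iterate WLS estimation, the J-test, and largest-normalized-residue removal.

    z: (m,) measurements; H: (m, n) DC measurement matrix;
    sigma2: (m,) noise variances (diagonal covariance, an assumption here);
    tau: J-test threshold, kept fixed across iterations (an assumption here).
    Returns the final state estimate and the indices of the removed meters.
    """
    z = np.asarray(z, dtype=float)
    H = np.asarray(H, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    active = list(range(len(z)))             # meters still in use
    removed = []
    while True:
        Hk, zk, sk = H[active], z[active], sigma2[active]
        S_inv = np.diag(1.0 / sk)
        G = Hk.T @ S_inv @ Hk
        K = np.linalg.solve(G, Hk.T @ S_inv)  # WLS gain matrix
        x_hat = K @ zk                        # linear WLS estimate, cf. (4.12)
        W = np.eye(len(active)) - Hk @ K      # residue map, cf. (4.9)
        r = W @ zk                            # residue vector, cf. (4.13)
        J = float(r @ S_inv @ r)              # test statistic of (4.8)
        if J <= tau:
            return x_hat, removed
        var = np.diag(W) * sk                 # diagonal of W Sigma
        r_norm = np.zeros_like(r)
        ok = var > tol                        # zero variance: single-meter critical set
        r_norm[ok] = r[ok] / np.sqrt(var[ok]) # normalized residue, cf. (4.10)-(4.11)
        worst = int(np.argmax(np.abs(r_norm)))
        removed.append(active.pop(worst))

# Hypothetical toy data (assumption): 2 states, 5 meters, one gross error.
H_toy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [2.0, 1.0]])
x_true = np.array([1.0, 2.0])
z_toy = H_toy @ x_true
z_toy[2] += 10.0                              # bad data on meter 2
x_hat, removed = bad_data_processing_dc(z_toy, H_toy, np.ones(5), tau=1e-6)
```

With noiseless data and a single gross error on a non-critical meter, the largest normalized residue falls on the corrupted meter, so the sketch removes meter 2 and then recovers the true state.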


4.4 Data Framing Attack

This section presents a new attack strategy on state estimation, referred to as
the data framing attack, which exploits the bad data processing to remove data from
some  normally  operating  meters  and  make  the  adversary  meters  appear  to  be 
trustworthy.  We  present  the  main  idea  and  the  QCQP  framework  for  the  optimal 
design  of  the  attack. 

We focus our attention on the case where the adversary cannot control enough
meters to launch a covert attack. The set of normal meters that the framing
attack aims to remove (i.e., frame as bad meters) is referred to as the target set,
denoted by $J_T$. The target set $J_T$ is chosen such that after the target meters are
removed from the grid, a covert attack becomes feasible. For instance, suppose
that J is a critical set. The feasibility condition of the covert attack, explained in
Section 4.2.3, implies that if $J \setminus J_A$ is removed from the grid, then the adversary
with $J_A$ can launch a covert state attack, because further removing all the meters
in $J_A$ makes the grid unobservable. Therefore, $J \setminus J_A$ can be set as the target set
$J_T$.


The  resulting  state  perturbation  by  the  framing  attack  does  depend  on  the 
choice  of  the  target  set.  Finding  the  optimal  target  set  for  a  given  attack  objective 
is certainly an important problem. However, it is outside the scope of this chapter. We
focus  on  the  design  of  the  attack  vector  for  a  fixed  target  set. 


4.4.1 Effect of Attack on Normalized Residues

To  analyze  how  the  attack  affects  the  bad  data  processing,  we  analyze,  under  the 
DC  model  (4.3),  the  adversarial  effect  on  the  normalized  residue  vector  in  the  first 
iteration.  In  this  subsection,  we  omit  the  superscript  to  simplify  notation:  all 
the  quantities  we  consider  are  associated  with  the  first  iteration  unless  otherwise 
specified. 

Suppose that z is the measurement vector without bad data under the DC
model (4.3). The normalized residue in the first iteration is obtained as

$$\tilde{r} = \Omega r = \Omega W z, \tag{4.14}$$

where $\Omega = \Omega^{(1)}$ is defined as in (4.11).

Due to the normalization, each entry $\tilde{r}_i$ is distributed as $\mathcal{N}(0, 1)$ unless {i} is
a critical set [51]; if {i} is a critical set, the normalized residue for the meter i is
always equal to zero for any z.

If an attack vector a is added, the resulting normalized residue is

$$\tilde{r} = \Omega W (z + a) = \Omega W z + \Omega W a. \tag{4.15}$$

Thus, if {i} is not a critical set, $\tilde{r}_i$ is distributed as $\mathcal{N}((\Omega W a)_i, 1)$; if {i} is critical,
$\tilde{r}_i = (\Omega W a)_i$ surely.

Recalling that the absolute normalized residues (i.e., $|\tilde{r}_i|$) are the statistics used
for identifying the bad data entries, one intuitive heuristic to get the target meters
removed is to make the mean energy of the normalized residues at the target meters
as large as possible. Making the target meters have large normalized residues in
the first iteration is of course not a guarantee of their removal in the following
iterations. Nevertheless, this is a reasonable heuristic to avoid the difficult task of
analyzing the dynamic adversarial effect in subsequent iterations. Note that

$$E\Big[ \sum_{i \in J_T} \tilde{r}_i^2 \Big] = \sum_{i \in J_T} (\Omega W a)_i^2 + C, \tag{4.16}$$

where C is the number of the meters in $J_T$ that do not form a single-element critical
set. Therefore, maximizing the mean energy of the normalized target residues is
equivalent to maximizing $\sum_{i \in J_T} (\Omega W a)_i^2 = \|S_T \Omega W a\|_2^2$, where $S_T \in \mathbb{R}^{|J_T| \times m}$ is
the row-selection matrix which retains only the rows corresponding to the target
meters.
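
For completeness, the step behind the mean-energy expression can be written out. The short derivation below is a sketch that uses only the distributional facts stated above; the shorthand $\mu_i$ for the mean of the ith normalized residue is introduced here for illustration and is not the author's notation.

```latex
% Shorthand (assumption): \mu_i = (\Omega W a)_i, the mean of \tilde{r}_i under attack.
% Case 1: \{i\} is not a critical set, so \tilde{r}_i \sim \mathcal{N}(\mu_i, 1):
E[\tilde{r}_i^2] = \operatorname{Var}(\tilde{r}_i) + (E[\tilde{r}_i])^2 = 1 + \mu_i^2 .
% Case 2: \{i\} is a critical set, so \tilde{r}_i = \mu_i surely:
E[\tilde{r}_i^2] = \mu_i^2 .
% Summing over the target meters, with C the number of target meters in Case 1:
E\Big[\sum_{i \in J_T} \tilde{r}_i^2\Big] = \sum_{i \in J_T} \mu_i^2 + C .
```

Because C does not depend on the attack vector a, only the first term can be influenced by the adversary, which is why the heuristic reduces to maximizing the squared norm of the target entries of $\Omega W a$.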

4.4.2  Optimal  Framing  Attack  via  QCQP 

The  ultimate  objective  of  the  framing  attack  is  to  gain  an  ability  to  perturb  the 
state  estimate  by  an  arbitrary  degree.  To  this  end,  the  framing  attack  aims  to 
accomplish  two  tasks. 

The  first  is  to  make  the  bad  data  processing  remove  the  target  meters  such 
that  the  network  with  the  remaining  meters  becomes  vulnerable  to  a  covert  state 
attack  by  the  adversary.  As  discussed  in  Section  4.4.1,  we  attempt  to  achieve  this 
goal  by  maximizing  the  mean  energy  of  the  normalized  target  residues,  which  is 
equivalent to maximizing $\|S_T \Omega W a\|_2^2$.

The second task is to ensure that the attack becomes covert after the target
meters are removed, thereby making arbitrary state perturbation possible. Let
$H_0$ denote the m x n measurement matrix obtained from H by replacing the
rows corresponding to the target meters with zero row vectors. Then, the attack
becomes covert (i.e., the attack vector lies in the column space of the measurement
matrix) after the target meters are removed, if and only if a is in $\mathcal{R}(H_0)$. Therefore,
we restrict the attack vector a to be not only in the feasible set $\mathcal{A}$ but also in $\mathcal{R}(H_0)$.


Based  on  the  aforementioned  intuition,  we  solve  the  following  optimization  to 
find  the  optimal  direction  to  align  the  attack  vector: 


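$$\begin{aligned} \max_{a} \quad & \|S_T \Omega W a\|_2^2 \\ \text{subj.} \quad & \|a\|_2^2 = 1, \;\; a \in \mathcal{R}(H_0) \cap \mathcal{A}. \end{aligned} \tag{4.17}$$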

The  optimization  (4.17)  gives  the  optimal  direction  a*  of  the  attack  vector  to 
maximize  the  mean  energy  of  the  normalized  target  residues,  among  the  feasible 
directions  that  render  the  attack  covert  after  the  target  meters  are  removed. 


To provide a more intuitive description of the feasible set in (4.17), we introduce
the $(m - |J_A| - |J_T|) \times n$ matrix $\bar{H}$ obtained from H by removing the
rows corresponding to the adversary and target meters. It can be easily seen that
$a \in \mathcal{R}(H_0) \cap \mathcal{A}$ if and only if $a = H_0 x_0$ for some $x_0 \in \mathcal{N}(\bar{H})$. Therefore, the dimension
of $\mathcal{R}(H_0) \cap \mathcal{A}$ is equal to the dimension of $\mathcal{N}(\bar{H})$. For instance, if $J_A \cup J_T$ is a
critical set, $\bar{H}$ has rank n - 1, and its null space has dimension one. Therefore, in
this case, $\mathcal{R}(H_0) \cap \mathcal{A}$ is a one-dimensional space, and there is no need to search for
the optimal direction. On the other hand, if $J_A \cup J_T$ contains more than one critical
set, the dimension of $\mathcal{N}(\bar{H})$ is greater than one, and the optimization (4.17)
searches for the optimal direction among the infinite set of feasible directions.
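
The characterization above suggests a direct way to compute a basis of the feasible set: take a null-space basis of the matrix left after removing the adversary and target rows, and map it through the matrix whose target rows are zeroed. The following is a minimal sketch under that reading; the helper names and the 5-meter toy matrix are hypothetical, and the sketch assumes the resulting columns are linearly independent (which the text asserts for the basis matrix used later).

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis (as columns) of the null space of M, via SVD."""
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

def framing_basis(H, adversary, target):
    """Columns span {H0 x0 : x0 in null space of H without adversary/target rows}."""
    m, _ = H.shape
    excluded = set(adversary) | set(target)
    keep = [i for i in range(m) if i not in excluded]
    H_bar = H[keep]                  # neither adversary nor target rows
    H0 = H.copy()
    H0[list(target)] = 0.0           # target rows replaced with zero rows
    return H0 @ null_space(H_bar)    # nonzero only on adversary rows

# Hypothetical toy system (assumption): meters 2 and 4 form a critical set.
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
B = framing_basis(H, adversary=[2], target=[4])
```

Here the adversary and target meters together form a single critical set, so the basis has one column, supported on the adversary meter only, in line with the one-dimensional case described above.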


Finally, we set an attack vector a as $\eta a^*$, where $\eta \in \mathbb{R}$ is a parameter that
adjusts the direction (i.e., positive or negative depending on the sign of $\eta$) and
the magnitude of the resulting state perturbation. It is important to point out
that a sufficiently large $|\eta|$ is necessary for successful removal of the target meters,
because the mean of the $J(\hat{x})$-test statistic (of Bad Data Detection) increases
linearly with respect to $\eta^2$ [40], and we want the test statistic to be larger than the
threshold in multiple iterations such that Bad Data Identification can be invoked
enough times to remove all the target meters.


In practice, real-world power meters have very high signal-to-noise ratios
(SNRs) [76], which means that even a small attack vector can be detected by
the $J(\hat{x})$-test. Therefore, the necessary size of $|\eta|$ to invoke Bad Data Identification
in multiple iterations is expected to be reasonably small. The numerical
examples in Section 4.6 demonstrate that the framing attack that perturbs the
measurement vector by less than 1% in L1-norm can succeed under a moderately
high SNR setting.


The optimization (4.17) can be written as a QCQP:

$$\begin{aligned} \min_{q} \quad & q^T P q \\ \text{subj.} \quad & q^T Q q - 1 = 0, \;\; q \in \mathbb{R}^p, \end{aligned} \tag{4.18}$$

where

$$P = -(S_T \Omega W B)^T (S_T \Omega W B), \quad Q = B^T B, \tag{4.19}$$

and $B \in \mathbb{R}^{m \times p}$ is the basis matrix of the p-dimensional vector space $\mathcal{R}(H_0) \cap \mathcal{A}$.
Note that the dimension p is nonzero because the target meters are set such that
a covert attack becomes feasible after their removal. In addition, P is negative
semidefinite, and Q is positive definite since B has full column rank. The positive
definiteness of Q implies that a solution exists (i.e., the objective function is
bounded below).


The KKT conditions for (4.18) are as follows:

$$Pq + \lambda (Qq) = 0, \quad q^T Q q - 1 = 0, \tag{4.20}$$

where $\lambda$ is the Lagrange multiplier for the equality constraint. The optimal solution
q* of (4.18) is the one that results in the minimum objective function value among
all $(\lambda, q)$ pairs satisfying the KKT conditions (4.20).


The KKT conditions (4.20) imply that

$$Q^{-1} P q = -\lambda q; \qquad q^T P q = q^T (-\lambda Q q) = -\lambda q^T Q q = -\lambda. \tag{4.21}$$

For any solution $(\lambda, q)$ of (4.20), the first equation means that $-\lambda$ should be an
eigenvalue of $Q^{-1} P$, and q should be in the corresponding eigenspace. The second
equation means that the objective function value at q is equal to $-\lambda$. Therefore,
we can find an optimal solution q* of (4.18) as follows: (i) find the minimum
eigenvalue of $Q^{-1} P$, which corresponds to the largest $\lambda$ and hence the smallest
objective value, and (ii) find an eigenvector q* in the corresponding eigenspace
that satisfies $(q^*)^T Q q^* - 1 = 0$. Once q* is found, an optimal solution a* of the
original problem (4.17) is constructed as a* = Bq*.
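
The eigenvalue route to the optimal direction can be sketched numerically. The code below is an illustration under stated assumptions rather than the author's implementation: the argument `M` stands in for the product of the row-selection, normalization, and residue matrices, `B` stands in for a full-column-rank basis of the feasible set, both are filled with random numbers purely for demonstration, and a Cholesky factor of Q is used to turn the generalized eigenproblem into a symmetric one.

```python
import numpy as np

def optimal_framing_direction(M, B):
    """Maximize ||M a||_2^2 over a = B q subject to q^T (B^T B) q = 1.

    M: (t, m) matrix applied to the attack vector (assumed given);
    B: (m, p) basis of the feasible subspace, full column rank (assumed).
    Returns the unit-norm direction a* and the attained objective value.
    """
    A = M @ B
    P = -(A.T @ A)                     # negative semidefinite
    Q = B.T @ B                        # positive definite
    L = np.linalg.cholesky(Q)          # Q = L L^T
    L_inv = np.linalg.inv(L)
    C = L_inv @ P @ L_inv.T            # symmetric, same eigenvalues as Q^{-1} P
    C = 0.5 * (C + C.T)                # remove round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    u = eigvecs[:, 0]                  # eigenvector of the most negative eigenvalue
    q_star = L_inv.T @ u               # satisfies q*^T Q q* = 1
    return B @ q_star, -eigvals[0]

# Random stand-ins (assumption), used only to exercise the routine.
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 6))
B = rng.normal(size=(6, 2))
a_star, energy = optimal_framing_direction(M, B)
```

The change of variables keeps the problem symmetric, so `numpy.linalg.eigh` applies and the returned eigenvalues are real; the most negative eigenvalue gives the minimum of the QCQP objective, and its negation is the maximized residue energy.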


4.5  Factor-of-Two  Result 

In  this  section,  we  demonstrate  that  the  framing  attack  enables  the  adversary 
controlling  only  a  half  of  a  critical  set  of  meters  to  perturb  the  state  estimate  by 
an arbitrary degree. Specifically, given a partition $\{J_1, J_2\}$ of a critical set of meters,
we present a sufficient condition under which the adversary can control one of $J_1$
or $J_2$ to perturb the state estimate by an arbitrary degree. We provide numerical
evidence from IEEE benchmark networks that for the critical sets associated with
cuts, we can find a partition with $|J_1| \approx |J_2|$ satisfying the sufficient condition.


4.5.1 Estimation of Adversarial State Estimate Perturbation


The  exact  analysis  of  how  the  framing  attack  would  perturb  the  state  estimate  at 
the  end  of  the  iterative  bad  data  processing  is  a  difficult  task.  However,  assuming 
that  the  meter  SNRs  are  high,  we  can  estimate  the  effect  of  the  framing  attack  as 
follows.  Since  SNRs  of  most  practical  meters  tend  to  be  higher  than  46  dB  [76], 
the  high  meter  SNR  assumption  is  reasonable. 


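Suppose that the attacker adds the attack vector a to z, and the bad data
processing is executed on the corrupted measurement vector. The measurement
vector in the kth iteration is

$$z^{(k)} = H^{(k)} x + a^{(k)} + e^{(k)}, \tag{4.22}$$

where $H^{(k)}$, $a^{(k)}$, and $e^{(k)}$ are obtained from H, a, and e by removing the (k - 1)
rows corresponding to the meters identified as bad until the (k - 1)st iteration.
The state estimate $\hat{x}^{(k)}$ is

$$\begin{aligned} \hat{x}^{(k)} &= [(H^{(k)})^T (\Sigma^{(k)})^{-1} H^{(k)}]^{-1} (H^{(k)})^T (\Sigma^{(k)})^{-1} z^{(k)} \\ &= x + [(H^{(k)})^T (\Sigma^{(k)})^{-1} H^{(k)}]^{-1} (H^{(k)})^T (\Sigma^{(k)})^{-1} (a^{(k)} + e^{(k)}). \end{aligned} \tag{4.23}$$

Hence, the state estimate perturbation after the kth iteration is

$$\hat{x}^{(k)} - x = [(H^{(k)})^T (\Sigma^{(k)})^{-1} H^{(k)}]^{-1} (H^{(k)})^T (\Sigma^{(k)})^{-1} (a^{(k)} + e^{(k)}). \tag{4.24}$$

In addition, since $W^{(k)} H^{(k)} = 0$, the residue vector is

$$r^{(k)} = W^{(k)} (H^{(k)} x + a^{(k)} + e^{(k)}) = W^{(k)} (a^{(k)} + e^{(k)}). \tag{4.25}$$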


From (4.24) and (4.25), we can see that neither the state estimate perturbation nor
the residue vector depends on the actual state x. Considering that bad
data detection and identification at each iteration exclusively rely on the residue
vector, the observation from (4.24) and (4.25) implies that if we are interested in
analyzing how much the attack perturbs the final state estimate, i.e., $\hat{x}^{(N)} - x$,
where N denotes the total number of iterations, we can simply work with a + e by
assuming that x is equal to 0.
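
The independence from the true state noted in (4.24) and (4.25) is easy to confirm numerically for a single iteration. The following minimal sketch uses randomly generated stand-ins for the measurement matrix, noise covariance, attack vector, and noise (all hypothetical) and compares the outcome for two different true states.

```python
import numpy as np

def wls_and_residue(z, H, Sigma):
    """Linear WLS state estimate and the corresponding residue vector."""
    S_inv = np.linalg.inv(Sigma)
    G = H.T @ S_inv @ H
    x_hat = np.linalg.solve(G, H.T @ S_inv @ z)
    return x_hat, z - H @ x_hat

# Random stand-ins (assumption): 6 meters, 3 states, diagonal covariance.
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 3))
Sigma = np.diag(rng.uniform(0.5, 2.0, size=6))
a = rng.normal(size=6)            # attack vector
e = rng.normal(size=6)            # noise realization
x1 = rng.normal(size=3)           # two different true states
x2 = rng.normal(size=3)

x_hat1, r1 = wls_and_residue(H @ x1 + a + e, H, Sigma)
x_hat2, r2 = wls_and_residue(H @ x2 + a + e, H, Sigma)
perturbation1 = x_hat1 - x1       # depends only on a + e
perturbation2 = x_hat2 - x2
```

Both the perturbations and the residues coincide across the two true states, which is what justifies setting x to 0 in the analysis.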

Furthermore, if the meter SNRs are significantly large (i.e., $\sigma^2 \ll 1$), we can
estimate the resulting state perturbation by running the noiseless version of the
bad data processing on the attack vector a and checking the resulting $\hat{x}^{(N)}$. The
noiseless version means the algorithm to which the bad data processing converges
as $\sigma^2$ decays to 0. Specifically, $\Sigma$ is replaced³ by $\bar{\Sigma}$, and in the kth iteration, the
detector declares presence of bad data if and only if $(r^{(k)})^T (\bar{\Sigma}^{(k)})^{-1} r^{(k)} > 0$ (i.e.,
the data are declared to be good if and only if the state estimation results in a
zero residue vector).

4.5.2  Factor-of-Two  Theorem  for  Critical  Sets 

Suppose that $\{J_1, J_2\}$ is a partition of a critical set, and let $\bar{H}$ denote the measurement
matrix after removing the meters in $J_1 \cup J_2$ from the grid. Since $J_1 \cup J_2$ is
a critical set, $\bar{H}$ has rank n - 1, and the dimension of its null space is one. Let
$\Delta x$ denote a unit basis vector of the null space of $\bar{H}$. Recalling the discussion in
Section 4.4.2, if $J_1$ is the set of adversary meters, and $J_2$ is the target set, then the
framing attack aligns the attack vector along $H_1 \Delta x$, where $H_1$ is the m x n matrix
obtained from H by replacing the rows corresponding to the meters in $J_2$ with zero
row vectors ($H_2$ is defined in the same way by replacing the rows corresponding
to $J_1$).

3 Note that State Estimation and Bad Data Identification are not affected by the value of $\sigma^2$,
because $\sigma^2$ gets cancelled out in the state estimate expression (4.12), and Bad Data Identification
depends on the relative magnitudes of each residue with respect to other residues, which are not
affected by the value of $\sigma^2$. Only Bad Data Detection is affected by the decaying $\sigma^2$.

The  following  theorem  provides  a  sufficient  condition  that  guarantees  that  the 
framing attack can use one of $J_1$ and $J_2$ to perturb the state estimate by an arbitrary
degree  under  the  high  SNR  setting.  The  condition  is  based  on  the  result  of  running 
the  deterministic  test  described  in  Section  4.5.1. 


Theorem 4.5.1 Suppose that if we run the noiseless version of the state estimation
and the bad data processing on $H_1 \Delta x$, then there exists a unique state $y \in \mathbb{R}^n$
such that the final state estimate is always equal to y (i.e., $\hat{x}^{(N)} = y$) regardless
of whatever decisions are made under tie⁴ situations in Bad Data Identification.
Under this condition, the following hold for any true state $x \in \mathbb{R}^n$:

(1) Suppose $y \neq 0$. If the framing attack using $J_1$ as adversary meters and $J_2$
as target meters (i.e., $a = \eta H_1 \Delta x$ where $\eta \in \mathbb{R}$ is a scaling factor) is launched,
then

$$\lim_{\sigma^2 \to 0} \Pr\big(z^{(N)} = H^{(N)}(x + \eta y) + e^{(N)}\big) = 1, \tag{4.26}$$

where N is the random variable representing the total number of iterations in the
bad data processing.

(2) Suppose $y \neq \Delta x$. If the framing attack using $J_2$ as adversary meters and
$J_1$ as target meters (i.e., $a = \eta H_2 \Delta x$) is launched, then

$$\lim_{\sigma^2 \to 0} \Pr\big(z^{(N)} = H^{(N)}(x + \eta(\Delta x - y)) + e^{(N)}\big) = 1. \tag{4.27}$$

4It  is  possible  that  a  tie  may  occur  in  Bad  Data  Identification  at  some  iteration:  i.e.,  the 
largest  absolute  normalized  residue  is  assumed  by  more  than  one  meter.  In  a  tie  situation, 
we  assume  that  Bad  Data  Identification  chooses  an  arbitrary  meter  with  the  largest  absolute 
normalized  residue. 

Proof: See Section 4.7. ■

The event $\{z^{(N)} = H^{(N)}(x + \eta y) + e^{(N)}\}$ means that the final measurement
vector at the end of the bad data processing is a noisy measurement vector with
the state perturbed by $\eta y$. Theorem 4.5.1 implies that if the condition is met,
then at least one of $J_1$ and $J_2$ can be used by the framing attack to perturb the
state estimate by an arbitrary degree, because y cannot be simultaneously 0 and
$\Delta x$. In particular, if the condition holds for the partition with $|J_1| = |J_2|$, then the
adversary controlling only a half of the critical set can perturb the state estimate
by an arbitrary degree.

One important question is whether a partition $\{J_1, J_2\}$ with $|J_1| \approx |J_2|$ that
satisfies the condition of Theorem 4.5.1 can be found in general. To answer this
question, we investigated critical sets associated with cuts⁵ in the IEEE 14-bus
and 118-bus networks, where every bus has an injection meter and every line has
line meters for both directions. The spanning tree observability criterion in [26]
implies that the set J of the meters associated with a cut (i.e., the set of the line
meters on the cut-set lines and the injection meters on both ends of the cut-set
lines) forms a critical set if removing the cut-set decomposes the topology into two
connected graphs. For instance, the cut in Fig. 4.2 disconnects bus 3 from the
rest of the network, and {{2, 3}, {3, 4}} is the associated cut-set. The set of circled
red meters is the critical set associated with the cut.
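
The construction of the meter set associated with a cut can be sketched in a few lines. In this illustration, the function name and the encoding of meters are assumptions: a line meter for the flow from i to j is written as the tuple (i, j), an injection meter at bus i as the 1-tuple (i,), and the line list is only a small fragment chosen for the example, not the full 14-bus topology. The sketch does not check whether removing the cut-set leaves two connected graphs, which the criterion above requires for the set to be critical.

```python
def meters_for_cut(lines, side):
    """Return the cut-set and the set of meters associated with a cut.

    lines: iterable of (i, j) bus pairs, one per line;
    side: set of buses on one side of the cut.
    """
    # A line belongs to the cut-set when its two ends lie on different sides.
    cut_set = [(i, j) for (i, j) in lines if (i in side) != (j in side)]
    meters = set()
    for i, j in cut_set:
        meters.update({(i, j), (j, i)})   # line meters, both directions
        meters.update({(i,), (j,)})       # injection meters at both ends
    return cut_set, meters

# Illustrative fragment of a line list (assumption), isolating bus 3.
lines = [(1, 2), (2, 3), (2, 4), (3, 4), (4, 5)]
cut_set, meters = meters_for_cut(lines, side={3})
```

For the cut that isolates bus 3, the sketch returns the cut-set {2, 3}, {3, 4} and seven meters: four line meters and the injection meters at buses 2, 3, and 4.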

We executed 20,000 runs of the random contraction algorithm by Karger and
Stein [77], a randomized algorithm for finding a cut, and found 118 cuts in the
14-bus network and 290 cuts in the 118-bus network. For each cut, we built a
partition $\{J_1, J_2\}$ of the critical set J associated with the cut such that $|J_1| \approx |J|/2$: $J_1$
consists of only the line meters (both directions) associated with a subset of lines in
the cut-set such that $\big| |J_1| - |J|/2 \big| \le 1$, and $J_2$ is set to be $J \setminus J_1$. In both networks, for
every cut we considered, the partition constructed in the aforementioned manner
satisfied the condition of Theorem 4.5.1; this suggests that the sufficient condition
is not stringent, at least for critical sets associated with cuts⁶.

5 A cut of an undirected graph $(V, E)$ is defined as a partition $\{V_1, V_2\}$ of V consisting of two
nonempty subsets, and the associated cut-set is the subset of lines connecting two vertices in
different partitions.

6 The average size of the critical sets we considered is 15.7 for the 14-bus case and 12.7 for the
118-bus case.

Figure 4.2: IEEE 14-bus network: the rectangles on lines and buses represent
line flow meters and bus injection meters respectively. The line meter on the line
between buses i and j that is closer to i measures the power flow from i to j. The red
dashed line describes a cut, and the circled meters are the meters associated with the cut.


4.6 Numerical Results

We tested the performance of the framing attack with the IEEE 14-bus and 118-bus
networks under both the DC and AC models. The AC simulation results
demonstrate the efficacy of the framing attack under the real-world power system
setting. Because the ultimate goal of the attack is to perturb the state estimate,
we measure the mean L2-norm of the resulting state estimate error:

$$E[\|\hat{x} - x\|_2],$$

where $\hat{x}$ is the state estimate, and x is the true state.

4.6.1  Simulation  Setting 

In the IEEE 14-bus and 118-bus networks, we chose representative attack scenarios
(i.e., adversary meters and target meters) and tested the performance of the
framing attack. For each case, we ran Monte Carlo simulations to evaluate the
mean state estimate perturbation. In each Monte Carlo run, the true state x was
generated from a multivariate Gaussian distribution with small variances. Its mean
was set as the operating state given by the IEEE 14-bus and 118-bus data [78].
Based on the generated state x, the noisy measurements were generated by the
measurement model (i.e., h(x) + e). The attack vector was constructed based on
the DC measurement matrix H as described in Section 4.4. Once constructed, the
attack vector was added to the noisy measurements, and state estimation and bad
data processing⁷ were executed on the corrupted measurements. After the bad
data processing finished, we measured $\|\hat{x}^{(N)} - x\|_2$.

7The  false  alarm  rate  of  the  bad  data  detector  is  set  to  be  0.04  throughout  all  the  simulations. 

The main difference between the DC and AC simulations is that we used different
measurement models for data generation. Note that the design of the framing
attack was studied for the DC model, which has only the real part of the
measurements. For the AC simulations, we designed the attack vector based on the
DC model, and the attack modified only the corresponding real part of the
measurements. Considering the linear decoupled model (see Chapter 2.7 in [51]), such
addition of the attack vector is expected to modify primarily the bus voltage phase
angles and have little effect on the bus voltage magnitudes. Hence, in interpreting
the AC results, we focus on the perturbation in the phase-angle part of the state
estimate.


For  comparison,  we  also  executed  the  conservative  scheme  in  [24],  which  aims 
to  perturb  the  state  estimate  by  the  maximum  degree  while  not  raising  any  alarm 
in the bad data processing. This scheme has been considered the best that an
adversary incapable of a covert state attack can do. In the conservative scheme,
the attack vector was designed as a solution to

$$\begin{aligned} \max_{a \in \mathcal{A}} \quad & \|(H^T \Sigma^{-1} H)^{-1} H^T \Sigma^{-1} a\|_2^2 \\ \text{subj.} \quad & r^T \Sigma^{-1} r \le \tau, \end{aligned} \tag{4.28}$$



where  the  constraint  guarantees  that  the  alarm  is  not  raised  at  all,  and  the  objective 
function  is  the  resulting  perturbation  of  the  state  estimate  due  to  the  attack  vector. 


4.6.2  Simulation  Results  with  14-Bus  Network 

We  first  tested  the  case  where  the  adversary  can  control  only  a  half  of  a  critical  set. 
Specifically,  we  considered  the  adversary  who  can  control  (2,3),  (3,4),  and  (4,3): 
(i,j) denotes the line meter for the power flow from i to j, and (i) denotes the
injection meter at bus i. The target meters were set to be (3,2), (2), (3), and (4)
such that the set of adversary meters and target meters is the critical set associated
with the cut in Fig. 4.2. We tested the framing attack with three different attack
magnitudes: $\|a\|_1$ is 1%, 2%, or 3% of $\|z\|_1$.

Fig.  4.3  shows  the  resulting  state  estimate  perturbations  versus  the  meter  SNR 
in  the  DC  simulations.  The  meter  SNR  ranges  from  26  dB  to  46  dB  (equivalently, 
the  noise-to-signal  amplitude  ratio  ranges  from  5%  to  0.5%.)  Note  that  the  SNR 
range  we  tested  is  no  greater  than  the  SNR  of  most  practical  meters  deployed 
in  real-world  power  networks  [76].  The  normal  state  estimate  error  and  the  state 
estimate  error  under  the  conservative  scheme  are  very  close,  and  both  decay  to  zero 
as  the  SNR  increases.  However,  the  state  estimate  error  under  the  framing  attack 
converges  to  a  constant,  which  is  proportional  to  the  attack  magnitude,  as  the  SNR 
increases.  The  result  implies  that  the  framing  attack  can  adjust  the  state  estimate 
perturbation  by  choosing  a  proper  attack  magnitude.  The  effect  of  the  framing 
attack  becomes  distinct  from  the  normal  state  estimate  error  when  the  SNR  is 
high.  To  demonstrate  the  relative  effect  of  the  framing  attack  with  respect  to  the 
normal  error,  Fig.  4.4  shows  the  resulting  state  estimate  perturbation  normalized 
with  respect  to  the  state  estimate  error  under  the  non-attack  scenario.  Under 
the  same  attack  setting,  Fig.  4.5  shows  the  state  estimate  perturbation  versus  the 
meter  SNR  in  the  AC  simulations.  It  can  be  observed  that,  especially  in  the  high 
SNR  region,  the  perturbation  amount  is  proportional  to  the  attack  magnitude. 
The  plots  imply  that  the  effect  of  the  framing  attack  persists  in  the  AC  model, 
thereby suggesting that the framing attack can be detrimental to real-world
power system state estimation.
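
The correspondence between the SNR range and the noise-to-signal amplitude ratio quoted above follows from the usual decibel convention. The following is a minimal sketch, assuming that the meter SNR in dB equals 20 log10 of the signal-to-noise amplitude ratio; the function name is illustrative and is not taken from the simulation code.

```python
def noise_to_signal_amplitude_ratio(snr_db: float) -> float:
    """Convert a meter SNR in dB to a noise-to-signal amplitude ratio.

    Assumes SNR_dB = 20 * log10(signal amplitude / noise amplitude).
    """
    return 10.0 ** (-snr_db / 20.0)


# Endpoints and midpoint of the SNR range used in the simulations.
for snr_db in (26, 36, 46):
    ratio = noise_to_signal_amplitude_ratio(snr_db)
    print(f"{snr_db} dB -> {100 * ratio:.2f}% noise-to-signal amplitude")
```

Under this convention, 26 dB and 46 dB map to roughly 5% and 0.5%, which matches the range stated in the text.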

Figure 4.3: DC simulations with the 14-bus network: 1,000 Monte Carlo runs. The
adversary meters are (2,3), (3,4), and (4,3), and the target meters are (3,2), (2),
(3), and (4). [Plot: state estimate error (degree); false alarm rate = 0.04.]

Figure 4.4: DC simulations with the 14-bus network: 1,000 Monte Carlo runs. The
adversary meters are (2,3), (3,4), and (4,3), and the target meters are (3,2), (2),
(3), and (4). [Plot: normalized SE error (%); false alarm rate = 0.04.]

Figure 4.5: AC simulations with the 14-bus network: 1,000 Monte Carlo runs.
The adversary meters are (2,3), (3,4), and (4,3), and the target meters are (2), (3),
(4), and (3,2). [Plot: state estimate error (degree); false alarm rate = 0.04.]

Second, we demonstrate that the framing attack may pursue perturbation in
various directions by choosing a different target set. We considered the case that
the adversary controls (2,3), (3,4), (4,3), (6,12), (12,6), and (12,13). Note that
the adversary still cannot control any critical set, and thus a covert attack is
infeasible. The framing attack with any of the following three different target sets
successfully perturbed the state estimate: (i) (2), (3), (4), (3,2), (6), (12), (13),
and (13,12); (ii) (2), (3), (4), and (3,2); (iii) (6), (12), (13), and (13,12). For
instance, Fig. 4.6 shows the state estimate perturbation versus the meter SNR in the
AC simulations for the first target set. While all three target sets resulted in
successful state estimate perturbation, each resulted in a different direction of
perturbation. For each target set, Table 4.2 shows the three buses whose phase angle
estimates were most significantly perturbed and the mean perturbation of their
phase angle estimates; positive perturbation means overestimation, and negative
perturbation means underestimation. The table demonstrates that an adversary
controlling a large number of meters may adjust the direction of perturbation by
choosing a proper target set. Note that with the second target set, whose associated
critical set (i.e., the critical set contained in $J_a \cup J_t$) isolates bus 3, the
framing attack perturbed the bus-3 phase angle estimate significantly while having
little effect on other bus phase angle estimates. This is expected because once
the target meters are successfully removed from the network, the adversary can
control all the real meter measurements that depend on the bus-3 phase angle.
A similar effect can be observed for the framing attack with the third target
set, whose associated cut isolates bus 12. On the other hand, for the first target
set, once the target meters are removed, the adversary controls all the real meter
measurements that depend on the bus-3 phase angle or the bus-12 phase angle.
In this case, the framing attack, constructed by the QCQP framework in (4.17),
perturbed both bus-3 and bus-12 phase angle estimates.

Figure 4.6: AC simulations with the 14-bus network: 1,000 Monte Carlo runs.
The adversary meters are (2,3), (3,4), (4,3), (6,12), (12,6), and (12,13), and the
target meters are (2), (3), (4), (3,2), (6), (12), (13), and (13,12).
[Plot: state estimate error (degree); false alarm rate = 0.04.]

Table 4.2: The three buses whose phase angles are most significantly perturbed by
each attack: AC simulations, 1,000 Monte Carlo runs, SNR = 46 dB.

Target meters   (2), (3), (4), (3,2),       (2), (3),         (6), (12),
                (6), (12), (13), (13,12)    (4), (3,2)        (13), (13,12)
1)              bus 12: 2.075°              bus 3: -2.183°    bus 12: 2.878°
2)              bus 3: 0.272°               bus 14: 0.182°    bus 14: 0.005°
3)              bus 14: -0.180°             bus 9: 0.168°     bus 9: 0.004°
Figure 4.7: AC simulations with the 118-bus network: 250 Monte Carlo runs. The
adversary meters are (20,21), (21,20), and (21,22), and the target meters are (20),
(21), (22), and (22,21). [Plot: 118-bus state estimate error (degree); false alarm rate = 0.04.]

4.6.3  Simulation  Results  with  118-Bus  Network 


Through  the  simulations  with  the  118-bus  network,  we  aim  to  demonstrate  the 
effect  of  the  framing  attack  on  a  larger  network.  We  considered  the  scenario  where 
the  adversary  controls  (20,21),  (21,20),  and  (21,22),  and  the  target  meters  are 
(20),  (21),  (22),  and  (22,21);  i.e.,  the  set  of  the  adversary  meters  and  the  target 
meters is the critical set associated with the cut isolating bus 21 from the rest of
the  network.  Fig.  4.7  shows  the  state  estimate  errors  under  the  non-attack  scenario 
and  the  framing  attacks  with  different  attack  magnitudes  in  the  AC  simulations. 
The  plots  imply  that  the  framing  attack  successfully  perturbs  the  state  estimate, 
and  thus  its  adversarial  effect  persists  in  a  larger  network. 
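
The choice of adversary and target meters in these experiments can be cross-checked mechanically. The sketch below lists every meter whose DC-model measurement involves the phase angle of a given bus. It assumes full metering (an injection meter at every bus and flow meters in both directions on every line), and it uses only the two branches incident to bus 21 that are implied by the meter labels in the text. The function name and data layout are illustrative assumptions.

```python
def meters_depending_on_bus(bus, lines):
    """Return meters whose DC-model measurement involves the phase angle of `bus`.

    Assumes full metering: an injection meter (i,) at every bus and flow
    meters (i, j) and (j, i) on every line. `lines` holds undirected (i, j) pairs.
    """
    neighbors = {j for i, j in lines if i == bus} | {i for i, j in lines if j == bus}
    # The injection at a bus depends on its own angle and on its neighbors' angles.
    injections = {(bus,)} | {(n,) for n in neighbors}
    # A line flow depends on the angles at both ends of the line.
    flows = {(bus, n) for n in neighbors} | {(n, bus) for n in neighbors}
    return injections | flows


# Branches incident to bus 21, as implied by the meter labels in the text.
lines_near_21 = [(20, 21), (21, 22)]
adversary = {(20, 21), (21, 20), (21, 22)}
target = {(20,), (21,), (22,), (22, 21)}

critical = meters_depending_on_bus(21, lines_near_21)
print(sorted(critical, key=lambda m: (len(m), m)))
print(critical == adversary | target)
```

Under these assumptions, the computed set coincides with the union of the adversary meters and the target meters. This agrees with the statement that the two sets together form the critical set associated with the cut isolating bus 21.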
4.7 Proof of Theorem 4.5.1

Let $\mathcal{S}$ denote the set of sequences of meter removals that can possibly happen
when the noiseless version of bad data processing is executed on $H_1 \Delta x$; i.e.,
$(a_1, \ldots, a_m) \in \mathcal{S}$ if and only if some decisions under tie situations may result in
the removal of the meters $\{a_1, \ldots, a_m\}$ in the order of $a_1, \ldots, a_m$. The cardinality
of $\mathcal{S}$ can be greater than 1 since different decisions under tie situations may result
in different sequences of meter removals.

For any sequence $(a_1, \ldots, a_m) \in \mathcal{S}$, the existence of such $y$ (as described in the
condition) implies that if all the meters in the sequence are removed, the remaining
part of $H_1 \Delta x$, denoted by $H_1^{(m)} \Delta x$, is equal to $H^{(m)} y$, where $H_1^{(m)}$ and $H^{(m)}$ are
obtained from $H_1$ and $H$, respectively, by removing the rows corresponding to all
meters in the sequence.

Now, consider running the bad data test on $Hx + H_1 \Delta x + e$. Equation (4.25)
implies that the residue vector in each iteration depends only on $H_1 \Delta x + e$. In
addition, as $\sigma^2$ decreases to zero, the results of bad data detection and identification
heavily depend on $H_1 \Delta x$, and thus the sequence of removed meters becomes highly
likely to be in $\mathcal{S}$. Formally,

$$\lim_{\sigma^2 \to 0} \Pr\big((a_1, \ldots, a_N) \in \mathcal{S}\big) = 1, \tag{4.29}$$

where $(a_1, \ldots, a_N)$ is a random sequence of meters removed by the bad data test.
Let $H^{(N)}$ and $e^{(N)}$ denote the random matrix and vector obtained from $H$ and $e$,
respectively, by removing the rows corresponding to $\{a_1, \ldots, a_N\}$.

The event $\{(a_1, \ldots, a_N) \in \mathcal{S}\}$ implies that $H_1^{(N)} \Delta x = H^{(N)} y$, and thus

$$z^{(N)} = (H^{(N)} x + e^{(N)}) + H_1^{(N)} \Delta x = (H^{(N)} x + e^{(N)}) + H^{(N)} y. \tag{4.30}$$

Therefore,

$$\lim_{\sigma^2 \to 0} \Pr\big(z^{(N)} = (H^{(N)} x + e^{(N)}) + H^{(N)} y\big) = 1. \tag{4.31}$$

Note that replacing the attack vector $H_1 \Delta x$ with $\eta H_1 \Delta x$ simply changes the above
to

$$\lim_{\sigma^2 \to 0} \Pr\big(z^{(N)} = (H^{(N)} x + e^{(N)}) + H^{(N)} \eta y\big) = 1. \tag{4.32}$$

Now, consider running the bad data test over $Hx + H_2 \Delta x + e$; this is the case
when the framing attack is launched with the partition $J_2$. First, note that

$$H \Delta x = H_1 \Delta x + H_2 \Delta x. \tag{4.33}$$

Therefore, substituting $H_2 \Delta x = H \Delta x - H_1 \Delta x$, running the bad data test on
$Hx + H_2 \Delta x + e$ is equivalent to running it on $H(x + \Delta x) - H_1 \Delta x + e$.

Suppose we run the noiseless version of the bad data processing on $-H_1 \Delta x$.
The set of sequences of meter removals that can possibly happen is equivalent to
$\mathcal{S}$, because the sign change only flips the signs of residue entries; it does not affect
their absolute values, which are the statistics used for detection and identification
of bad data entries. Furthermore, it can be easily seen that the final state estimate
is always equal to $-y$ regardless of whatever decisions are made under the tie
situations.
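
Two linear-algebra facts are used in the argument above: the residue does not change when a vector of the form $Hx$ is added to the measurements, and negating the measurements negates every residue entry without changing its absolute value. Both can be checked numerically. The sketch below assumes an ordinary least-squares estimator with equal noise variances; the random matrix `H` and vector `v` are hypothetical stand-ins, not the 14-bus model.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((8, 3))   # hypothetical full-column-rank measurement matrix
x = rng.standard_normal(3)        # arbitrary true state
v = rng.standard_normal(8)        # stands in for H_1 Δx + e


def residual(H, z):
    """Least-squares residue z - H x_hat (equal noise variances assumed)."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return z - H @ x_hat


r = residual(H, v)
# Adding H x to the measurements leaves the residue unchanged.
same_after_shift = np.allclose(residual(H, H @ x + v), r)
# Negating the measurements negates the residue, so absolute values agree.
sign_flip = np.allclose(residual(H, -v), -r)
print(same_after_shift, sign_flip)
```

Because detection and identification use only the absolute values of the residue entries, the second property is what makes the set of possible removal sequences the same for $H_1 \Delta x$ and $-H_1 \Delta x$.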

Now, consider again running the bad data test on $H(x + \Delta x) - H_1 \Delta x + e$,
which is equivalent to $Hx + H_2 \Delta x + e$. In exactly the same manner as we derived
(4.31), we can derive the following:

$$\lim_{\sigma^2 \to 0} \Pr\big(z^{(N)} = H^{(N)}(x + \Delta x) + e^{(N)} + H^{(N)}(-y)\big) = 1, \tag{4.34}$$

or equivalently,

$$\lim_{\sigma^2 \to 0} \Pr\big(z^{(N)} = H^{(N)} x + e^{(N)} + H^{(N)}(\Delta x - y)\big) = 1. \tag{4.35}$$

When the attack vector $H_2 \Delta x$ is scaled by $\eta$ (i.e., $a = \eta H_2 \Delta x$), repeating the
same steps as above, we can easily derive the following:

$$\lim_{\sigma^2 \to 0} \Pr\big(z^{(N)} = H^{(N)} x + e^{(N)} + H^{(N)} \eta (\Delta x - y)\big) = 1. \tag{4.36}$$

Therefore, the proof is complete. ■
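
As an informal sanity check on the statement just proved, the following toy example runs a noiseless bad data test under a framing attack. It is only a sketch under strong simplifying assumptions: raw (not normalized) residues, a fixed hypothetical threshold, a two-state model in which each meter reads one state directly, and an adversary holding two of the three meters in the critical set. In this toy case $y = 0$, so the final estimate should equal $x + \Delta x$. It does not reproduce the QCQP construction in (4.17).

```python
import numpy as np


def bad_data_processing(H, z, threshold):
    """Repeatedly drop the meter with the largest absolute residue.

    Stops when every residue is within `threshold` or when no redundant
    meter is left. Returns the final estimate and the kept meter indices.
    """
    keep = list(range(H.shape[0]))
    while True:
        x_hat, *_ = np.linalg.lstsq(H[keep], z[keep], rcond=None)
        r = np.abs(z[keep] - H[keep] @ x_hat)
        worst = int(np.argmax(r))
        if r[worst] <= threshold or len(keep) <= H.shape[1]:
            return x_hat, keep
        del keep[worst]


# Toy model: meters 0-2 read state 0 directly, meters 3-5 read state 1 directly.
H = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
x = np.array([0.10, -0.20])
dx = np.array([0.00, 0.05])        # perturbation the adversary wants to induce

adversary, framed = [3, 4], [5]    # meters 3-5 form the critical set for state 1
a = np.zeros(6)
a[adversary] = (H @ dx)[adversary]  # attack vector H_2 Δx, zero elsewhere

x_hat, keep = bad_data_processing(H, H @ x + a, threshold=0.01)
removed = sorted(set(range(6)) - set(keep))
print("removed meters:", removed)
print("estimate:", x_hat, "x + dx:", x + dx)
```

In this example the untouched meter 5 has the largest residue and is removed, after which the two adversary meters agree with each other and the test passes with the perturbed estimate.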
CHAPTER 5

CONCLUSIONS 

In this dissertation, we studied attacks and countermeasures in communications
and power networks. Specifically, stepping-stone attacks in communications
networks and data attacks in power networks were considered.

Although the attack framework varied significantly depending on the network
type and the adversary's goal, there were some common observations. In particular,
it was observed in all attack problems that an adversary having
enough control over network operations can achieve its goal while hiding its presence.
Therefore, a challenging task for a network administrator is to find an affordable
protection strategy that can prevent such undetectable attacks. The main contribution
of this dissertation was to provide conditions on the adversary's ability under
which attacks can be detected, build network protection strategies based on the
detectability condition, and study what attacks can achieve if a network is not
secured properly.

In the following sections, we provide concluding remarks for each topic and
comments on future work.

5.1  Detection  of  Information  Flows 

In Chapter 2, we have studied timing-based detection of information flows in a
network. We formulate flow detection as a binary composite hypothesis testing
problem and present a detector that requires neither a parametric model nor a
training data set. The detector requires constant memory, and it has linear
computational complexity with respect to the sample size. The simulations with
real-world TCP and VoIP traces demonstrate that the proposed detector is superior
to the benchmark passive detectors and more suitable for the nonparametric and
unsupervised setting.

The test results with the real-world traces suggest that the proposed detector
may perform well in a more practical setting. Nevertheless, like all other passive
detectors, our detector has a fundamental limitation: it cannot detect the flow if the
presence of the flow generates no correlation at all between the timing measurements.


5.2  Topology  Attack  of  a  Power  Grid 

In Chapter 3, we have considered undetectable malicious data attacks aimed at
creating a false topology at the control center. We obtain a necessary and sufficient
condition for an attack launched by a strong attacker to be undetectable. We also
present a class of undetectable line removal attacks that can be launched by weak
attackers with only local information. Finally, we present a countermeasure against
strong attackers by protecting a subset of meters.

Some of the results presented in Chapter 3 are obtained under strong conditions.
Here, we mention several such limitations as pointers for further study. First, the
DC model assumed in Section 3.3 makes the results valid only near the operating
point. It has been demonstrated in [39] that the DC model tends to exaggerate
the effect of state attacks, and the nonlinear state estimator has the ability to
significantly reduce the attacks' impact on the state estimate. Obtaining conditions
for undetectable topology attacks under the AC model is of considerable interest.

Second, we have focused mostly on state-preserving topology attacks. Even
though such attacks are optimal under certain scenarios, to understand the full
implication of topology attacks, it is necessary to consider attacks that affect both
topology and states.

Finally, we consider only one particular form of countermeasure, namely
implementing authentication at a subset of meters. Other mechanisms should be
studied, including ones with more sophisticated bad data detection and those
taking system dynamics into account.


5.3  Data  Framing  Attack  on  State  Estimation 

In  Chapter  4,  we  have  presented  the  data  framing  attack  on  power  system  state 
estimation. Controlling only half of a critical set, the data framing attack can
perturb  the  state  estimate  by  an  arbitrary  degree.  A  theoretical  justification  was 
provided,  and  numerical  experiments  demonstrated  the  efficacy  of  the  framing 
attack. 

Our results indicate that most known countermeasures, which are aimed
merely at preventing covert state attacks, are not sufficient for protection against
attacks aimed at state perturbation. In designing countermeasures, the possibility
of the framing attack needs to be taken into account.

One important direction for future work is to find an easily verifiable necessary
condition for the framing attack to succeed with given adversary meters. Such a
condition is essential for designing a countermeasure.